Estimation of joint radiometric and geometric image deformations

ABSTRACT

Methods for estimating joint geometric and radiometric deformations relating two observations of the same object provide methods for identifying and matching images (as in face recognition) and for characterizing and determining the relationship between a pair of observations (as in target recognition).

RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Patent Application Ser. No. 61/076,078 filed Jun. 26, 2008, and U.S. Provisional Patent Application Ser. No. 61/076,084, filed Jun. 26, 2008, and is a continuation-in-part of U.S. patent application Ser. No. 12/177,669, filed Jul. 22, 2008 now U.S. Pat. No. 7,865,020, which is a continuation of U.S. patent application Ser. No. 10/981,943, filed Nov. 5, 2004 (now U.S. Pat. No. 7,406,197) which claims the benefit of U.S. Provisional Patent Application Ser. No. 60/518,006, filed Nov. 5, 2003 and U.S. Provisional Patent Application Ser. No. 60/604,745 filed Aug. 25, 2004. These foregoing applications are hereby incorporated by reference.

FIELD OF INVENTION

The present invention relates to methods for estimating joint geometric and radiometric images deformations relating two observations of the same object.

BACKGROUND OF THE INVENTION

Understanding different appearances of an object is an elementary problem in various fields. Since acquisition conditions vary (e.g., pose, illumination, acquisition system), the set of possible observations on a particular object is immense; therefore, the complicated task of characterizing and determining the relation between a pair of observations is crucial whether one is after the differences themselves (e.g., target tracking) or whether the interest is restricted to determining if two observations are of the same object (e.g., face recognition).

Image registration, and in general the problem of estimating transformations of observed objects, has been intensively studied for several decades. Results of this research have been applied in various fields such as remote sensing, medical imaging and computer vision, and to problems such as super-resolution, video compression, and many more.

The vast majority of studies focus on geometric-only registration; that is, alignment of the domain (coordinates) of images. Correspondingly, the term “registration” is commonly associated with geometry, which indeed is a challenging problem in itself. Intuitively, it seems that the simplest way to relate a pair of geometrically deformed images is by associating ordered sequences of some salient features, that can be located on both images (feature-based methods). These are usually selected to be line or point features, and thus, much of the intensity information contained in the image is discarded. To make this approach feasible, the problem of finding the correspondence between the features on both images must be solved first. Therefore, this approach may only be used when the features are easily detectable, the deformation is close enough to the identity, the number of features is relatively small (yet large enough to allow for a meaningful estimation), and there is a strong contextual evidence to guide the solution to the correspondence problem. Yet, in many problems the feature points are not easily identifiable and their number may be large, in which case the correspondence problem rapidly becomes very difficult to solve.

Featureless methods (area methods) avoid the correspondence problem by directly employing the intensity information of the image in order to establish an estimate of the geometric deformation, without identifying any features. Among such methods are those based on “correlation-like”, Fourier, mutual information and optical flow principles. Typically, these methods lead to the utilization of some iterative optimization procedure in order to estimate the deformation. As such, they are applicable only when the deformations are small, which lowers the risk of obtaining a local minimum as the solution. Explicit solution methods, which do not employ the minimization of a penalty function of some sort, are commonly restricted to a relatively small family of transformations. For example, there are explicit methods in the cases of translation only, rotation only, or scale only, but the solution becomes (partially) optimization based once combinations of translation, rotation and scaling are introduced.

On the other hand, various recent studies have focused on radiometry-only registration; that is, alignment of the range (values/intensities) of geometrically-aligned images. This problem is much simpler since images are geometrically pre-aligned, and thus pointwise correspondence inherently exists. Thus, straightforward methods may be used to combine images captured in different optical settings into a single image of high dynamic range (HDR).

Nonetheless, it seems that the problem of deriving a registration procedure in the presence of combined radiometric and geometric variations (i.e., both domain and range) has received much less attention. In this case, employing geometric registration methods without taking the radiometric variation into account is clearly far from optimal: featureless geometric registration methods are extremely sensitive to intensity variations and typically fail in the presence of some non-negligible radiometric transformation relating the intensities of the observations. This therefore leaves feature-based methods as a sole option, since they are less sensitive to radiometric variations. These methods, however, are not always applicable, as mentioned above. For example, in medical imaging, cross-modality geometric registration procedures are prone to failure due to significant, difficult-to-model, differences in pixel (voxel) intensities; this has led researchers to the employment of various computationally demanding variational methods for performing the registration. In the geometric registration of optical imaging, it is customary to evade the radiometric effects by using “radiometry-invariant” procedures (usually feature-based, as previously mentioned). The physical understandings and invariancy principles of the color constancy framework have also been utilized in attempt to minimize the effects of such radiometric variations (these, however, usually assume linear radiometric effects). Another approach, that relies on the principals of the aforementioned radiometry-only registration, overcomes radiometric phenomena in the registration of optical images (and, in fact, benefits from it) by using special optical apparatus (spatially varying filters attached to the camera).

SUMMARY OF THE INVENTION

The present invention is concerned with the problem of estimating joint geometric and radiometric image deformations. Observations of an object are assumed to simultaneously undergo an affine transformation of coordinates and a non-linear mapping of the intensities.

In the geometric aspect, the case of affine transformations of coordinates is basic and provides a “first-order” approximation to more complex cases (such as “small” projective deformations, etc.). As such, registration of images subject to geometric affine transformation has gained significant interest. In the radiometric aspect, the type of global variability we consider is often referred to as intensity mapping. It naturally appears in the important case of single-modal registration, where non-linearities are typically introduced by an image acquisition system as the overall non-linearity of its various components (sensors, amplifiers, etc.). In optical imaging, for example, such non-linearities are characterized by the camera response function (CRF). Notably, such non-linearities are sometimes deliberately introduced in hardware design for various reasons. It has also been suggested that this type of global mapping may model the intensity variations in certain medical imaging multi-modal schemes.

More specifically, let g:R^(m)→R be a function, representing a multi-dimensional signal (e.g., m=2 in the case of images). Consider an observation h on g of the form h(x)=Q(g(A(x))), where Q is invertible and A is affine. The right-hand composition of g with A (composition from within) can be thought of as a spatial/time deformation (i.e., a deformation of the domain—the coordinate system), while its left-hand composition with Q (composition from without) can be seen as a memoryless non-linear input/output system applied to the signal's amplitude. Hence, in terms of image formation, such model physically corresponds to a simultaneous deformation of both geometry and radiometry. It should also be mentioned that in this case, the set of allowed deformations we consider, naturally takes the convenient structure of a (product) group. Taking this point of view, in the absence of noise, the joint registration problem is formulated as follows: given two functions (images/signals) g and h, we are to find, if one exists, a pair (Q,A) such that h(x)=Q(g(A(x))).

As previously mentioned, the difficulty of the joint geometric-radiometric estimation problem lead to the current state where only a few attempts have been made to solve it. The lack of point-wise correspondence (due to the geometric transformation) and the lack of intensity-wise alignment (due to the radiometric mapping) does not allow for a simple direct usage of the intensity information of the images. Seemingly, the geometric and radiometric problems are strongly coupled and may not be answered separately. As such, straightforward approaches for solving this problem typically lead to a high-dimensional non-linear non-convex optimization problem. Only a few works have explicitly modeled joint geometric-radiometric deformations. Indeed, among these, most evade the inherent non-linearity of this estimation problem through linear approximation and/or a variational optimization-based approaches.

An exception is the work of Candocia, whereby the joint registration of images in domain as well as range is accomplished using piecewise linear comparametric analysis; the camera's non-linear comparametric function is approximated with a constrained piecewise linear model; the registration model is then expanded using a first order Taylor expansion (of the image), resulting in a linear estimation problem. Being based on first order Taylor expansion, such solutions are restricted by the (implicit) assumptions of small geometric deformations and differentiability of the images, required for the approximation to be reasonable. Although such assumptions may sometimes be acceptable (for example, in the mosaicing of an image sequence shown as an example in F. M. Candocia, Jointly registering images in domain and range by piecewise linear comparametric analysis, IEEE Transactions on Image Processing, 12(4):409-419, April 2003), they are restrictive in general; an attempt to employ a Taylor based solution in the presence of a large geometric transformation (e.g., the simple case of a 180 degrees rotation) will certainly fail. Furthermore, even when such methods may be employed, the solution is approximated.

The present invention provides a method for solving the joint estimation problem in terms of an equivalent linear estimation problem. In contrast to many featureless estimation methods, and in contrast with the joint registration method proposed by Candocia, the method we describe is neither approximation-based nor does it have strong assumptions on the model (e.g., differentiability of the images). Also, no assumptions are made as to the magnitude of the geometric and radiometric deformations relating the observations. The solution we propose is explicit and thus computationally efficient. Moreover, in the absence of noise it is shown to be exact (not approximated), regardless of the deformations' magnitudes.

Below we rigorously formulate the problem addressed in this disclosure. We then discuss the problem in the absence of noise; the underlying algebraic structure of the problem and the notion of sample distribution are exploited in order to decouple the problem into two independent estimation problems: one for the unknown radiometric deformation and the other for the unknown geometric deformation. Each of those is then equivalently reformulated as a linear estimation problem. We then illustrate the estimation scheme using a synthetic example.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 illustrates the problem description (where different non-linear intensity mappings are associated with each of the color channels of the image).

FIG. 2 shows “flaming fractals”: the template image g (left) and the observation image h (right).

FIG. 3 shows the sample distributions of the template image, Tg (left) and of the observation image, Th (right).

FIG. 4 illustrates the radiometric deformations Q_(R), Q_(G) and Q_(B) of the three color channels (solid lines) and their estimates (dashed lines).

FIG. 5 shows radiometry estimation: radiometrically-registered template—the result of applying the estimated radiometric mappings Q_(R), Q_(G) and Q_(B) to the template image g (left); the observation image h (right).

FIG. 6 illustrates the duality property of left-right compositions applied to monotone functions.

FIG. 7 shows the template (left) and a sample observation (right) at −3 dB.

FIG. 8 shows the radiometric deformation Q and its estimates (based on a single realization). Estimates were obtained by using the solution under the assumption that the observation is noisefree (dashed-dotted line), and by the estimator obtained under the assumption that the observations are noise corrupted (dashed line).

FIG. 9 shows a radiometry estimation—performance comparison.

FIG. 10 shows the error in estimating the affine deformation A.

FIG. 11 shows a sample of the experiment dataset. A total of over 10,000 images were collected at different combinations of geometry (rotation angle) and radiometry (illumination power and shutter speed). The chosen template image is shown at the center.

FIG. 12 illustrates a pair of images taken at different illumination power and shutter speed settings. 78% @ 0.01 sec. (left) 48% @ 0.3 sec. (right).

FIG. 13 shows the joint histograms and estimates of Q_(R), Q_(G) and Q_(B) of a pair of images taken at different illumination power and shutter speed settings. In the bottom-right are plotted the estimates Q_(R), Q_(G) and Q_(B).

DETAILED DESCRIPTION OF THE INVENTION

1(a) Statement of the Problem

We begin by informally stating the problem. Suppose we are given a single observation h, about a known m-dimensional signal g, of the form h(x)=Q(g(A(x)))+η(x), where Q a non-linear mapping of the intensities (radiometry); A is an affine (geometric) transformation; and η is an additive noise. We study the problem of estimating the joint radiometric and geometric deformations, which amounts to the estimation of Q and A, as illustrated by the simple block diagram presented in FIG. 1.

More formally, let A:R^(m)→R^(m) be an affine transformation of coordinates, that is, A:x

Ax+c where A∈R^(m×m) is non-singular and c∈R^(m). A shall represent the geometric deformation. Let Q:R→R be an increasing invertible function, representing the non-linear radiometric deformation. Let us further assume that Q(0)=0.

Denote by B_(c)(R^(m)) the space of bounded, compactly supported, Lebesgue measurable functions from R^(m) to R. Let g∈B_(c)(R^(m)) and let {η(x):x∈R^(m)} be a real-valued i.i.d. random field with a known distribution function F_(η).

Throughout, we shall use ∘ to denote the composition of functions, and supp{f} shall be used to denote the support of a function f, i.e., the closure of the set where f does not vanish. With these notations, the problem addressed is the following:

Given the known function g and a single measurement (observation) h of the form h(x)=[Q∘g∘A](x)+η(x)  (1.1) for all x∈supp{Q∘g∘A}, find an estimate for the left-hand composition Q and the affine transformation A.

Remark 1. The restriction x∈supp{Q∘g∘A} may be interpreted as the assumption of a known segmentation.

Remark 2. An important special case of (1.1) is the additive white Gaussian noise (AWGN) model, where {η(x):x∈R^(m)} is also assumed to be zero mean Gaussian with a known variance σ_(η) ².

Remark 3. Since Q is invertible, the recovery (estimation) of Q or of its inverse Q⁻¹ are equivalent. Similarly, estimating A or A⁻¹ are equivalent.

1(b) The Noiseless Case

In this section we discuss the noiseless case, that is, where η≡0 in (1.1). Hence, the problem model becomes h=Q∘g∘A,  (1.2) where to simplify the notations we omit the argument of the functions.

In the following, we show that the joint problem may be decoupled into two simpler problems in the unknown left-hand composition Q and affine transformation A, respectively.

In order to do so, let us first introduce the basic transformation studied throughout this discussion. Denote by λ the Lebesgue measure on R^(m), the m-dimensional Euclidean space. We define the sample distribution transformation T on B_(c)(R^(m)) by

$\begin{matrix} {{{\lbrack{Th}\rbrack(t)} = \frac{\lambda\left\{ {{x \in {{supp}\left\{ h \right\}}}:{{h(x)} \leq t}} \right\}}{\lambda\left\{ {{supp}\left\{ h \right\}} \right\}}},{h \in {{B_{c}\left( R^{m} \right)}.}}} & (1.3) \end{matrix}$

The sample distribution may be thought of as a continuous “cumulative histogram” of a function. The next simple lemma demonstrates the role of T as a “distribution” transformation and elaborates on some of its properties with respect to certain right- and left-hand compositions.

Lemma 1. For a given function g∈B_(c)(R^(m)), the following statements hold:

1. The function G(t)=[Tg](t) is a distribution function. Furthermore, the support of the distribution G(t) is bounded, in the following sense:

-   -   (a) G(t)=0 for t<inf_(x)g(x).     -   (b) G(t)=1 for t>sup_(x)g(x).

2. T is invariant under right-hand affine compositions: T(g∘A)=Tg for any non-singular affine transformation A:R^(m)→R^(m).

3. T(W∘g)=[Tg]∘W⁻¹ for any strictly increasing continuous function W:R→R such that W(0)=0.

Proof. Part (1) is immediate. Notice that supp{h∘A}=A⁻¹(supp{h}) ; this is simply since {x:h(A(x))=0}={A⁻¹(y):h(y)=0}. Thus, using the properties of the Lebesgue measure, we have λ{supp{h∘A}}=λ{A ⁻¹(supp{h})}=|A ⁻¹|λ{supp{h}}, where |A⁻¹| denotes the determinant of the transformation A⁻¹. Next, notice that T admits the following equivalent integral form

$\begin{matrix} {{\lbrack{Th}\rbrack(t)} = {\frac{1}{\lambda\left\{ {{supp}\left\{ h \right\}} \right\}}{\int_{{supp}{\{ h\}}}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ h} \right\rbrack(x){{\mathbb{d}{\lambda(x)}}.}}}}} & (1.4) \end{matrix}$ Set y=A(x), thus x=A⁻¹(y) and dλ(x)=|A⁻¹|dλ(y). Hence, by (1.4) and a change of variables, we have

$\begin{matrix} {{\left\lbrack {T\left( {h\; \circ A} \right)} \right\rbrack(t)} = {\frac{1}{\lambda\left\{ {{supp}\left\{ {h \circ A} \right\}} \right\}}{\int_{{supp}{\{{h \circ A}\}}}{{\chi_{({{- \infty},t}\rbrack}\left( {h\left( {A(x)} \right)} \right)}{\mathbb{d}{\lambda(x)}}}}}} \\ {= {{\frac{1}{{A^{- 1}}\lambda\left\{ {{supp}\left\{ h \right\}} \right\}}{\int_{{supp}{\{ h\}}}{{\chi_{({{- \infty},t}\rbrack}\left( {h(y)} \right)}{A^{- 1}}{\mathbb{d}{\lambda(y)}}}}} =}} \\ {\lbrack{Th}\rbrack(t)} \end{matrix}$ for all t, and thus (2) is proved. Lastly, since W is strictly increasing and W(0)=0 we have that supp{W∘h}=supp{h}. Hence, for all t we have

$\begin{matrix} {{\left\lbrack {T\left( {W \circ h} \right)} \right\rbrack(t)} = \frac{\lambda\left\{ {{x \in {{supp}\left\{ {W \circ h} \right\}}}:{{\left\lbrack {W \circ h} \right\rbrack(x)} \leq t}} \right\}}{\lambda\left\{ {{supp}\left\lbrack {W \circ h} \right\}} \right\}}} \\ {= {\frac{\lambda\left\{ {{x \in {{supp}\left\{ h \right\}}}:{{h(x)} \leq {W^{- 1}(t)}}} \right\}}{\lambda\left\{ {{supp}\left\{ h \right\}} \right\}} =}} \\ {{\left\lbrack {\lbrack{Th}\rbrack \circ W^{- 1}} \right\rbrack(t)},} \end{matrix}$ thus (3) is proved.

In the following we show that properties (2) and (3) above are the key in enabling an elegant solution to the joint registration problem. The next two subsections present two complementary approaches, in which either the radiometry, Q, or the geometry, A, is first estimated.

1(b)(i) Radiometry-First Approach

In this subsection, by directly employing the properties of the transformation T, a radiometry-first estimation scheme is derived. By simply applying T to the (noiseless) relation (1.2) and using the above properties, we obtain the following functional relation Th=T(Q∘g∘A)=T(Q∘g)=[Tg]∘Q ⁻¹.  (1.5) Hence, the following corollary may be stated: Corollary 2. Let H(t)=[Th](t) and G(t)=[Tg](t). Then, for all t∈R the following relation holds H(t)=[G∘Q ⁻¹](t)=G(Q ⁻¹(t)).  (1.6)

We thus conclude that the functions H and G are related by a right-hand composition Q⁻¹. Hence, using the transformation T we have “converted” a functional relation expressed by a left-hand composition (i.e., “radiometric deformation”) into a new functional relation expressed by a right-hand composition (i.e., “geometric deformation”). In fact, equation (1.6) describes a new, one-dimensional, “time-domain” registration problem. Hence, any suitable (parametric or non-parametric) registration method may be used to recover (estimate) Q⁻¹.

“Time-domain” registration problems of the type (1.6) have gained substantial interest in many research fields. Considered as key problems (for example in speech or hand-writing recognition), many solution methods have been proposed. As such, these methods (e.g., DTW and modified-DTW) may be used as is, or with minimal adaptation, in order to solve for in the above case. Commonly, however, these methods typically lead to optimization-based techniques.

Nevertheless, moment-like non-linear functionals may be utilized to produce linear constraints on Q. Thus, enabling the explicit recovery of Q⁻¹ in terms of solving a linear system of equations.

Namely, suppose that Q is also continuously differentiable. In this case, (1.6) is an estimation problem of the form considered in J. M. Francos, R. Hagege, and B. Friedlander, Estimation of multidimensional homeomorphisms for object recognition in noisy environments, Signals, Systems and Computers, 2003. Conference Record of the Thirty-Seventh Asilomar Conference on, 2:1615-1619, 2003. However, since H and G are distribution functions, they are not compactly supported (as functions). Thus, further “compactification” is required in order to employ the method proposed in Francos et al. for the estimation of homeomorphic deformations of compactly supported signals.

Let ε>0 be some arbitrarily small number and define

$\begin{matrix} {{c_{ɛ}(t)} = \left\{ \begin{matrix} {t,{t < {1 - ɛ}}} \\ {0,\text{elsewhere}} \end{matrix} \right.} & (1.7) \end{matrix}$ Next, let {tilde over (H)}=c_(ε)∘H and {tilde over (G)}=c_(ε)∘G. By left composing c_(ε) on both sides of (1.6), we find that

$\begin{matrix} \begin{matrix} {\overset{\sim}{H} = {{c_{ɛ} \circ H} = {c_{ɛ} \circ \left( {G \circ Q^{- 1}} \right)}}} \\ {= {{\left( {c_{ɛ} \circ G} \right) \circ Q^{- 1}} = {{\overset{\sim}{G} \circ Q^{- 1}}.}}} \end{matrix} & (1.8) \end{matrix}$ Thus, {tilde over (H)} and {tilde over (G)} are bounded, compactly supported, Lebesgue measurable functions from R to itself, related by the right-hand composition Q⁻¹, which, by assumption, has a continuously differentiable inverse.

Let {e_(i)} be a countable basis of L₂(supp{{tilde over (H)}}). Since, by assumption Q′ is continuous, it is in L₂(supp{{tilde over (H)}}) and can be represented as

$\begin{matrix} {{Q^{\prime}(t)} = {\sum\limits_{i}{b_{i}{{e_{i}(t)}.}}}} & (1.9) \end{matrix}$ Using the estimation algorithm proposed in [20], any finite order model of the type (1.9) can now be solved for the coefficients {b_(i)} by means of solving a system of linear equations.

Namely, let w:R→R be a Lebesgue measurable function that vanishes at zero. By applying w to (1.8) as left composition and integration we obtain

$\begin{matrix} {{\int_{R}{{w\left( {\overset{\sim}{H}(t)} \right)}{\mathbb{d}t}}} = {\int_{R}{{w\left( {\overset{\sim}{G}\left( {Q^{- 1}(t)} \right)} \right)}{{\mathbb{d}t}.}}}} & (1.10) \end{matrix}$ By the change of variables τ=Q⁻¹(t), we have dt=Q′(τ)dτ. Hence, using the parameterization (1.9), under the finite sum assumption, we find that

$\begin{matrix} {{\int_{R}{{W\left( {\overset{\sim}{G}\left( {Q^{- 1}(t)} \right)} \right)}{\mathbb{d}t}}} = {{\int_{R}{{w\left( {\overset{\sim}{G}(\tau)} \right)}{Q^{\prime}(\tau)}{\mathbb{d}\tau}}} = {\sum{b_{i}{\int_{R}{{e_{i}(\tau)}{w\left( {\overset{\sim}{G}(\tau)} \right)}{{\mathbb{d}\tau}.}}}}}}} & (1.11) \end{matrix}$ If so, for any choice of function w:R→R a linear constraint in the unknown parameters {b_(i)} may be constructed as follows

$\begin{matrix} {{{\int_{R}{w \circ \overset{\sim}{H}}} = {\left\lbrack {\int_{R}{{e_{1}\left( {w \circ \overset{\sim}{G}} \right)}\mspace{20mu}\ldots{\int_{R}{e_{M}\left( {w \circ \overset{\sim}{G}} \right)}}}} \right\rbrack\begin{bmatrix} b_{1} \\ \vdots \\ b_{M} \end{bmatrix}}},} & (1.12) \end{matrix}$ where M is the model order of the expansion of Q′ in (1.9).

Choosing w's that produce linearly independent constraints lets us establish a linear system of equations in the parameters {b_(i)}. It is possible to show that different w's are “almost always” linearly independent, yielding an exact solution for Q′ in the absence of noise; in fact, a method for optimal selection of such constraints in the presence of model mismatch (with respect to prescribed optimality criteria) can be derived.

Lastly, since Q(0)=0, Q can be easily obtained by integration, which completes the estimation of the mapping Q.

Based on the conclusions in Francos et al. we conclude that if the derivative of Q admits a finite order representation in the form of (1.9), the solution for the radiometric deformation Q is completely determined and exact.

Once the radiometric deformation Q (or equivalently its inverse Q⁻¹) is recovered, (1.2) may be rewritten as

$\begin{matrix} {h = {{\underset{\underset{known}{︸}}{\left( {Q \circ g} \right)} \circ A}.}} & (1.13) \end{matrix}$ Hence, the problem reduces to a strictly geometric problem, where A may be recovered (see further discussion in 1(b)(iii) below). 1(b)(ii) Geometry-First Approach

In this subsection, the properties of the transformation T given in Lemma 1 are used to derive a geometry-first estimation scheme. As previously shown, T converts the joint problem (1.2), in the unknowns Q and A, to a “new” problem in a single unknown, Q⁻¹. In order to obtain a symmetric result in A, let us define an auxiliary operator R on B_(c)(R^(m)) by Rh=[Th]∘h−[Th](0). Next, apply R to the basic (noiseless) relation h=Q∘g∘A given in (1.2). Notice that [Th](0) is a constant function (over all of R^(m)), and thus [Th](0)∘A=[Th](0). Hence, by employing Corollary 1.2 and since Q(0)=0 we have

$\begin{matrix} {{Rh} = {{{\lbrack{Th}\rbrack \circ h} - {\lbrack{Th}\rbrack(0)}} = {{{\left( {\lbrack{Tg}\rbrack \circ Q^{- 1}} \right) \circ \left( {Q \circ g \circ A} \right)} - {\lbrack{Tg}\rbrack\left( {Q^{- 1}(0)} \right)}} = {{{\lbrack{Tg}\rbrack \circ g \circ A} - {\lbrack{Tg}\rbrack(0)}} = {{\left( {{\lbrack{Tg}\rbrack \circ g} - {\lbrack{Tg}\rbrack(0)}} \right) \circ A} = {{\lbrack{Rg}\rbrack \circ A}.}}}}}} & (1.14) \end{matrix}$ Thus, the following corollary may be stated: Corollary 3. Let H(x)=[Rh](x) and G(x)=[Rg](x). Then, for all x∈R^(m) the following relation holds H(x)=[G∘A](x)=G(A(x)).  (1.15)

Hence, R (which has been defined in terms of T) has converted the joint problem (1.2), in the unknowns Q and A, to a “new” problem in a single unknown, A.

We thus conclude that the functions H and G are related by a right-hand composition A. Hence, by applying the operator R we have eliminated the left composition Q, representing the radiometric deformation, in the basic relation (1.2). If so, equation (1.15) describes a new, strictly geometric, affine registration problem. Hence, any suitable registration method may be used to recover (estimate) A.

As mentioned above, geometric-only registration, i.e., the problem of registering signals that vary by transformations of the domain, is a field of active research. In particular, a vast number of methods have been proposed in the (fundamental) case of affine geometric transformations. As with the case of “time-domain” registration problems, discussed above, each of these methods may be used, as is or with minimal adaptation, in order to solve for A in the above case.

Whereas the majority of the solution methods are either approximate or optimization-based, only a few explicit linear methods have been proposed. In particular, in R. Hagege and J. M. Francos, Parametric estimation of two-dimensional affine transformations, Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on, 3:iii-305-308, 2004, it is shown that (1.15) may be reformulated as a linear system of equations in A. The method derived in Hagage, et al. resembles the method discussed in above for the parametric estimation of homeomorphisms; for the completeness of the discussion, its outline is given as follows:

Let A(x)=Ax+c. For now, let us assume that |A|=det(A)=k is known. Let w:R→R be a measurable function that vanishes at zero. By applying w to (1.15) as left composition, multiplication by x and integration (as in the calculation of a first order moment) we obtain

$\begin{matrix} {{\int_{R^{m}}{{{xw}\left( {H(x)} \right)}{\mathbb{d}x}}} = {\int_{R^{m}}{{{xw}\left( {G\left( {{Ax} + c} \right)} \right)}{{\mathbb{d}x}.}}}} & (1.16) \end{matrix}$ Denote b=−A⁻¹c. By the change of variables y=Ax+c, we have x=A⁻¹y+b and dx=|A⁻¹|dy=k⁻¹dy (recall that |A|=k), yielding

$\begin{matrix} {{\int_{R^{m}}{{{xw}\left( {G({Ax})} \right)}{\mathbb{d}x}}} = {{\int_{R^{m}}{\left( {{A^{- 1}y} + b} \right){w\left( {G(y)} \right)}k^{- 1}{\mathbb{d}y}}} = {{k^{- 1}A^{- 1}{\int_{R^{m}}{{{yw}\left( {G(y)} \right)}{\mathbb{d}y}}}} + {k^{- 1}b{\int_{R^{m}}{{w\left( {G(y)} \right)}{{\mathbb{d}y}.}}}}}}} & (1.17) \end{matrix}$

By putting (1.17) in matrix form, it is clear that for any choice of function w:R→R a linear constraint in the unknown entries of A⁻¹ and b may be constructed as follows

$\begin{matrix} {\underset{\underset{known}{m \times 1}}{\underset{︸}{k{\int_{R^{m}}{x\left( {w \circ H} \right)}}}} = {\underset{\underset{\underset{unknown}{m \times {({m + 1})}}}{︸}}{\begin{bmatrix} A^{- 1} & b \end{bmatrix}} \cdot {\underset{\underset{\underset{known}{{({m + 1})} \times 1}}{︸}}{\begin{bmatrix} {\int_{R^{m}}{x\left( {w \circ G} \right)}} \\ {\int_{R^{m}}\left( {w \circ G} \right)} \end{bmatrix}}.}}} & (1.18) \end{matrix}$

Choosing w's that produce linearly independent constraints lets us establish a linear system of equations in the entries of A⁻¹ and b. It is possible to show that different w's are “almost always” linearly independent, yielding an exact solution in the absence of noise; in fact, as mentioned before, a method for optimal selection of such constraints in the presence of model mismatch (with respect to prescribed optimality criteria) is derived in R. Hagege and J. M. Francos, Parametric estimation of multi-dimensional affine transformations in the presence of noise: a linear solution, In Proc. IEEE/SP 13th Workshop on Statistical Signal Processing, pages 55-58, 2005 (the details are out of the scope of this work).

Relaxing the assumption that |A|=k is known is straightforward: again, by simply changing variables, it is easy to see that for any measurable function w:R→R that vanishes at zero we have

$\begin{matrix} {{A} = {\frac{\int_{R^{m}}{w \circ G}}{\int_{R^{m}}{w \circ H}}.}} & (1.19) \end{matrix}$ Hence, |A|=k may be calculated in advance.

Once the geometric transformation A is determined, (1.2) may be rewritten as

$\begin{matrix} {h = {{Q \circ \underset{\underset{known}{︸}}{\left( {g \circ A} \right)}}.}} & (1.20) \end{matrix}$ Hence, the problem reduces to a strictly radiometric problem, where Q may be recovered (see further discussion in 1(b)(iii) below). 1(b)(iii) Joint Registration—Concluding Results

This short subsection concludes the results derived for solving the problem of joint registration in the case of noiseless measurement. In 1(b)(i) and 1(b)(ii) above we have presented two complementary approaches for recovering the unknown radiometric and geometric transformations from the joint problem. These are summarized as follows:

Recall that the noiseless joint registration problem is h=Q∘g∘A, where the unknowns are Q and A. Using the transformations T and R (which is defined in terms of T) it is possible to symmetrically decouple this problem into two strictly geometric problems in each of the unknowns (Corollaries 2 and 3): [Th]=[Tg]∘Q ⁻¹, [Rh]=[Rg]∘A.

Hence, a solution to the joint problem may be obtained by separately solving the latter two problems for the unknowns Q⁻¹ and A, respectively. Accordingly, explicit parametric solution methods, compatible with either the estimation of Q⁻¹ or A, have been presented. It has been also noted that once one of these new problems is solved, the original problem may be reduced to a radiometric- or geometric-only problem—again solvable by the same means.

Moreover, if the conditions in 1(b)(i) and 1(b)(ii) are satisfied, the overall solution for both the geometric and the radiometric deformations, Q and A, is completely determined and exact (i.e., not approximated), regardless of its magnitude.

1(b)(iv) The Noiseless Case—Numerical Example

The following synthetic example illustrates the estimation scheme derived in the previous section for the noiseless joint geometry-radiometry estimation problem. In this example, g and h are the two-dimensional RGB images of a “flaming fractal” shown in FIG. 2. Both images are of dimensions 1188×891 and the values of each channel are in the range of [0,1].

According to the noiseless model (1.2), the observation image, h, is a version of the template image, g, subject to both geometric and radiometric deformations. The geometric deformation, an affine transformation of coordinates, was chosen to be

${A = \begin{bmatrix} 1.3 & {0.3} \\ 0.1 & {{- 1.}3} \end{bmatrix}};\mspace{14mu}{c = \begin{bmatrix} 0 \\ 0 \end{bmatrix}}$

The radiometric deformation is due to three different point-wise non-linear mappings applied to the intensities (amplitudes) of each of the channels of the template g. The mappings Q_(R), Q_(G) and Q_(B), respectively corresponding to the RGB channels of the image, were chosen to be the following polynomials Q _(R)(t)=2t−t ² ; Q _(G)(t)=2t ² −t ⁴ ; Q _(B)(t)=t

The significant difference between the template and the observation images, both geometrically and radiometrically (color-wise), is easily noticeable. Therefore, featureless methods (area methods), which directly employ the intensity information of the images in order to establish an estimate of the geometric deformation will typically fail due to the (non-negligible) intensity mismatches. On the other hand, feature based methods, which are in a sense invariant to global mappings of intensities, will have “little to grasp on” in our example due to the “fractal” nature of the images. Feature based methods may also fail in more realistic scenarios (e.g., IR images), where the advantages of using a joint radiometric-geometric featureless method are clear, as emphasized by the example above.

Given only the two images g and h, we employed the radiometry-first estimation scheme derived in 1(b)(i). According to which, the first stage is estimating the non-linear radiometric deformations Q_(R), Q_(G) and Q_(B). In order to do so, the sample distribution transformation T was applied to each of the channels of the images g and h (see 1(b)(i) for implementation remarks); the obtained distributions are shown in FIG. 3.

Corollary 2 now implies that the distributions of the template and observation are related by a deformation of coordinates, due to a right hand composition with Q_(R) ⁻¹, Q_(G) ⁻¹ and Q_(B) ⁻¹, in each channel respectively. Thus, Q_(R), Q_(G) and Q_(B) were individually estimated by applying the method proposed in 1(b)(i). The intensity mapping functions and their corresponding estimates, as obtained by the proposed algorithm, are shown in FIG. 4. The estimation errors are 2.67·10⁻⁷; 9.02·10⁻⁸ and 1.95·10⁻⁷, respectively, (evaluated as the L₂ norm of the difference between the correct and estimated functions, normalized by the integration interval).

Next, the estimated radiometric mappings were applied to the template g to obtain a radiometrically-registered version of the template relative to the observation, as shown in FIG. 5. Noticeably, the images in FIG. 5 match radiometrically (color-wise), and differ only by geometry. As in (1.13), the problem has been reduced to a strictly geometric problem.

In order to estimate the affine transformation we employed the method outlined in 1(b)(ii) above. The estimate was obtained jointly for all channels by stacking the constraints (1.18) established for each individual channel into a single over-determined system. This system was solved for the elements of A and c using a least squares solution, which yielded the following estimate

${\hat{A} = \begin{bmatrix} 1.2970 & {0.2908} \\ 0.0980 & {{- 1.}3009} \end{bmatrix}};\mspace{14mu}{\hat{c} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}}$

The error in the estimation of A is 5.77·10⁻⁵ (where we take the estimation error to be ∥I−A⁻¹Â∥₂ ²).

2. Estimation in the Presence of Noise

Recall the basic model h=Q∘g∘A+η,  (2.21) where to simplify the notations we omit the argument of the functions. In the previous section we have only discussed that case where η vanishes. In this section we discuss the noisy case, that is, where η in (2.21) is a real-valued i.i.d. random field with a known distribution function F_(η). Remark 1. An important special case of (2.21) is the additive white Gaussian noise (AWGN) model, where η is also assumed to be zero mean Gaussian with a known variance σ_(η) ².

Previously, we have shown that the sample distribution transformation, T, plays a key role in enabling an elegant solution to the joint registration problem in the noiseless case. Therefore, it is natural to study the behavior of T in the case of noisy observations.

2(a) Introduction

Let us begin with an informal discussion. The easiest way to get some intuition as to the notion of the sample distribution in the case of noisy observations is to consider for a moment the discrete case, where histograms exist. Next, we intuitively answer the following question: How does the histogram of the sum of an image and a noise process look?

Each bin of the histogram represents the relative frequency of the respective level (value) in the image. Due to the additive noise, any image sample contributing to a single bin of the image histogram, say the n-th bin, will remain at the same level with some probability p(noise=0). Similarly an image pixel originally contributing to the n−1 bin, may with probability p(noise=1) contribute to the n-th bin of the noisy image histogram. Repeating the same argument for the entire support of the noise probability function, and summing-up all the contributions, we obtain the value of the n-th bin of the noisy image. Repeating the same argument for each bin of the image histogram implies that the histogram of the noisy observation is, qualitatively, the result of convolving the image histogram with the probability function of the noise.

More formally, suppose that h takes the additive model form h(x)=g(x)+η(x), x∈supp{g},  (2.22) where g:R^(m)→R is a known deterministic function and {η(x):x∈R^(m)} is a real-valued i.i.d. random field with a known distribution function F_(η). That is, for now, we disregard the left composition Q and the right composition A.

Random fields of the type (2.22) commonly represent noisy signals over a continuous domain, where one continuously measures some continuous physical quantity; the additive random component represents the overall measurement noise, usually due to the measurement procedure.

Fields of the type (2.22) are not identically distributed; moreover, their probability distribution function is location dependent, i.e., they are not, in any sense, stationary. However, one may still expect the sample distribution of h to hold information on both the deterministic and random components. Hence, the question of determining this sample distribution is an interesting problem on its own.

Intuitively, since h is the sum of two independent components, one may expect that by employing T, we can establish a law of large numbers to yield Th=Tg*f_(η), where f_(η) is the probability density function of η. However, the transformation T may not be directly applied to a field of the type (2.22), due to inherent measurability difficulties, to be soon discussed. That being the case, the following question must first be answered:

Can the “sample distribution” of a random field of the type (2.22) be defined, such that is has analogous properties to those introduced by the transformation T?

Of course the sample distribution of h may be defined in many ways. However, we pursue a definition that preserves the properties of T, elaborately discussed in 2(c), and lets us establish a sensible law of large numbers.

Considering similar problems in the case where the domain of the random field is discrete (i.e., where x∈Z^(m)) is elementary: the case of random processes of the type (2.22) with discrete-parameter (m=1) is straightforward and the case of multi-dimensional random fields with discrete-indices (m>1) reduces to the one-dimensional case upon selection of a total order on Z^(m).

Nevertheless, there are scenarios where, due to the inherent structure of a problem, it is natural to discuss continuous-parameter multi-dimensional random fields (i.e., where x∈R^(m)). An example of such scenario is the problem of image registration we discuss. Inherently, the mapping A of R^(m) into itself is of a continuous nature, as are the physical phenomena it represents in the various problems (e.g., time warping, geometric deformation, etc.). Thus, if we impose a discrete model (e.g., x∈Z^(m)), we find that, in general, the natural A to consider is incompatible (as for “almost all” x∈Z^(m), A(x)∉Z^(m)).

However, as explained below, considering the sample distribution or, in general, laws of large numbers in the case of continuous-parameter random fields with mutually independent random variables raises severe measurability difficulties. Such i.i.d.-driven random fields are not measurable in the usual sense, and thus, the notion of sample distribution, as introduced by T, is ill-posed and has to be properly redefined. In J. L. Doob, Stochastic processes, Wiley Classics Library, John Wiley & Sons Inc., New York, 1990 (Reprint of the 1953 original), A Wiley-Interscience Publication (p. 78, 102), Doob mentioned that random processes with mutually independent random variables are too irregular to discuss in the continuous-parameter case. In a sense, sample paths of such processes are too erratic to be measurable. Indeed, in this case, the conditions of independence and joint measurability are incompatible with each other; in fact, the set of realizations whose corresponding sample-functions (sample-paths) are Lebesgue measurable is a non-measurable set; moreover, its inner and outer measures are zero and one, respectively. Furthermore, in K. L. Judd, The law of large numbers with a continuum of IID random variables, J. Econom. Theory, 35(1):19-25, 1985, Judd showed that, even if the sample-measurability problem is avoided (by a proper completion of the measure), laws of large numbers may not hold; the set of realizations where the laws of large numbers hold is again not measurable. Therefore, the Lebesgue measure offers no basis for a meaningful concept of the mean or the sample-distribution of a sample function.

Questions related to a continuum of independent and identically distributed random variables and corresponding laws of large numbers (e.g., sample-distribution) have evidently gained some interest, especially in economic theory, where various mass economic phenomena are modeled and studied. For example, in H. Uhlig, A law of large numbers for large economies, Econom. Theory, 8(1):41-50, 1996, a Riemann-like approach is invoked to integrate the sample function; then, laws of large numbers are obtained by using an L₂-norm convergence criterion. In another approach, large economies are modeled by hyperfinite processes which are measurable with respect to Loeb product spaces, and corresponding laws of large numbers are derived.

In the first three parts of this section 2 we present an approach in which the desired continuous structure of the deterministic component g is maintained while avoiding the measurability difficulties attributed to the random component η. In Part 2(c), we redefine the sample distribution transformation by “uniformly sampling” the random field h. This transformation establishes a strong law of large numbers (in the stochastic case) and reduces to T in the deterministic case. The conditions on which this transformation may be a applied to a random field of the additive model type (2.22) are discussed.

In 2(d) we determine the sample distribution of the random field h, in terms of the sample distribution of the deterministic component g and of the probability distribution of the random field η. Not surprisingly, the result we obtain resembles the known fact that the probability distribution of the sum of two independent random variables is the convolution of their distributions.

Finally, in 2(e) we apply the results to derive a solution to the foregoing registration problem in the case where the observation is subject to an additive noise. Similarly to the noiseless case, estimation of both geometric and radiometric deformation is accomplished by of solving linear systems of equations.

2(b) Distribution Transformation of a Deterministic Function

We begin by defining the three basic transformations we shall discuss.

Let {u_(i)}_(i=1) ^(∞) ⊂R^(m) be a given sequence of points in R^(m). For any function h:R^(m)→R let us define the family of transformations {T_(n) ^({u) ^(i) ^(})}_(n=1) ^(∞) by

$\begin{matrix} {{{\left\lbrack {T_{n}^{\{ u_{i}\}}h} \right\rbrack(t)} = {\frac{1}{n}\#\left\{ {{i = 1},\ldots\mspace{14mu},{n:{{h\left( u_{i} \right)} \leq t}}} \right\}}},} & (2.23) \end{matrix}$ where #A denotes the cardinality of the set A. Furthermore, whenever the limit

$\lim\limits_{n\rightarrow\infty}{\left\lbrack {T_{n}^{\{ u_{i}\}}h} \right\rbrack(t)}$ exists for all t∈R, we define T^({u) ^(i) ^(}) by

$\begin{matrix} {{T^{\{ u_{i}\}}h} = {\lim\limits_{n\rightarrow\infty}{T_{n}^{{\{ u_{i}\}}h}.}}} & (2.24) \end{matrix}$ Recall that the transformation T on B_(c)(R^(m)) has already been defined as

$\begin{matrix} {{{\lbrack{Th}\rbrack(t)} = \frac{\lambda\left\{ {{x \in {{supp}\left\{ h \right\}}}:{{h(x)} \leq t}} \right\}}{\lambda\left\{ {{supp}\left\{ h \right\}} \right\}}},\mspace{14mu}{h \in {{B_{c}\left( R^{m} \right)}.}}} & (2.25) \end{matrix}$ Notice that it also admits the following equivalent integral form

$\begin{matrix} {{{\lbrack{Th}\rbrack(t)} = {\frac{1}{\lambda\left\{ {{supp}\left\{ h \right\}} \right\}}{\int_{{supp}{\{ h\}}}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ h} \right\rbrack(x){\mathbb{d}{\lambda(x)}}}}}},} & (2.26) \end{matrix}$ where χ_(A) denotes the indicator function of the set A and ∘ denotes the composition of functions.

Remark 2. While the transformation T_(n) ^({u) ^(i) ^(}) is well defined for any real function, the transformation T^({u) ^(i) ^(}) is not necessarily defined for any selection of a sequence {u_(i)}_(i=1) ^(∞).

2(b)(i) Uniformly Distributed Sequences

To proceed, we introduce some basic definitions and results from the theory of uniform distribution of sequences (also known as equidistribution of sequences). For a=(a₁, . . . , a_(m)) and b=(b₁, . . . , b_(m)) in R^(m), we say that a≦b if a_(j)≦b_(j) for all j=1, . . . , m. Define the m-dimensional rectangle [a,b] as the set {x:a≦x≦b}. Using the notations 0=(0, . . . , 0) and 1=(1, . . . , 1), the rectangle [0,1] is the m-dimensional unit cube.

Definition 1. The sequence {u_(i)}_(i=1) ^(∞) ⊂[0,1] is uniformly distributed in [0,1] with respect to the Lebesgue measure λ (abbreviated λ-u.d.) if

${\lim\limits_{n->\infty}{\frac{1}{n}\#\left\{ {{i = 1},\ldots\mspace{14mu},{n:{u_{i} \in \left\lbrack {a,b} \right\rbrack}}} \right\}}} = {{\lambda\left\{ \left\lbrack {a,b} \right\rbrack \right\}} = {\prod\limits_{i = 1}^{n}\left( {b_{i} - a_{i}} \right)}}$ for all [a,b]⊂[0,1]. That is, in simple terms, the proportion of terms falling in any sub-rectangle is proportional to its volume.

Remark 3. Many constructive examples of λ-u.d. sequences in [0,1]⊂R exist. In fact, u.d. sequences are natural in the sense that a sequence of realizations of a uniformly distributed random variable is almost surely a lambda-u.d. sequence (an immediate result of the strong law of large numbers). A generalization of the construction of u.d. sequences to [0,1]⊂R^(m) is straightforward.

The following characterization of λ-u.d. sequences is given in L. Kuipers and H. Niederreiter, Uniform distribution of sequences, Wiley-Interscience [John Wiley & Sons], New York, 1974, a sequence {u_(i)}_(i=1) ^(∞) is λ-u.d. in [0,1] if and only if for every Riemann integrable function h on [0,1]

${\lim\limits_{n->\infty}{\frac{1}{n}{\sum\limits_{i = 1}^{n}{h\left( u_{i} \right)}}}} = {\int_{\lbrack{0,1}\rbrack}{{h(x)}{{\mathbb{d}{\lambda(x)}}.}}}$

Remark 4. This characterization cannot be generalized to Lebesgue measurable functions since, in general, the Lebesgue integral cannot be determined by the values of a function on any countable set of points.

We would like to expand the notion of λ-u.d. sequences to non-rectangular subsets of R^(m). In order to do so, let us briefly introduce the Jordan measure through the following characterization. Let A⊂R^(m) be a bounded set; the following are equivalent:

-   -   1. A is Jordan measurable.     -   2. χ^(A), the indicator function of A, is Riemann integrable.     -   3. λ{∂A}=0, that is, the boundary of A is of Lebesgue measure         zero.

Whenever a set is Jordan measurable, its Jordan measure (also called Jordan content) is exactly its Lebesgue measure. It should be noted that the Jordan measure is a weak notion of measure, since it is simply the restriction of the Lebesgue measure to the ring of bounded Lebesgue measurable sets having boundary of measure zero. Nevertheless, the Riemann integral can be defined in terms of Jordan measure in about the same way that the Lebesgue integral is defined in terms of Lebesgue measure. Therefore, since λ-u.d. sequences are characterized in terms of Riemann integrable functions, the natural non-rectangular subsets of R^(m) to consider in this context are Jordan measurable sets.

Throughout, whenever we let U⊂R^(m) be a compact, Jordan measurable subset of R^(m), we also assume it is of a positive measure.

Definition 2. Let U⊂R^(m) be a compact, Jordan measurable subset of R^(m). A sequence {u_(i)}_(i=1) ^(∞) ⊂U is λ-u.d. in U if

${\lim\limits_{n->\infty}{\frac{1}{n}{\sum\limits_{i = 1}^{n}{h\left( u_{i} \right)}}}} = {\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{{h(x)}{\mathbb{d}{\lambda(x)}}}}}$ for every Riemann integrable function h with supp{h}⊂U.

Remark 5. By using Definition 2, it is easy to see that the λ-u.d. property of a sequence is preserved under non-singular affine transformations: let A be a non-singular affine transformation of R^(m); {u_(i)}_(i=1) ^(∞) is λ-u.d. in U if and only if {A(u_(i))}_(i=1) ^(∞) is λ-u.d. in A(U).

Indeed, suppose {A(u_(i))}_(i=1) ^(∞) is λ-u.d. in A(U). Let h be a Riemann integrable function with supp{h}⊂U. Obviously, (h∘A⁻¹) is a Riemann integrable function whose support is contained in A(U). By applying the λ-u.d. property of {A(u_(i))}_(i=1) ^(∞) and substituting y=A⁻¹(x), we find that

$\begin{matrix} {{\lim\limits_{n->\infty}{\frac{1}{n}{\sum\limits_{i = 1}^{n}{h\left( u_{i} \right)}}}} = {\lim\limits_{n->\infty}{\frac{1}{n}{\sum\limits_{i = 1}^{n}{h\left( {A^{- 1}\left( {A\left( u_{i} \right)} \right)} \right)}}}}} \\ {= {\frac{1}{\lambda\left\{ {A(U)} \right\}}{\int_{A{(U)}}{{h\left( {A^{- 1}(x)} \right)}{\mathbb{d}{\lambda(x)}}}}}} \\ {{= {{\frac{1}{{A}\lambda\left\{ U \right\}}{\int_{U}{{h(y)}{A}{\mathbb{d}{\lambda(y)}}}}} = {\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{{h(y)}{\mathbb{d}{\lambda(y)}}}}}}},} \end{matrix}$ and therefore {u_(i)}_(i=1) ^(∞) is λ-u.d. in U.

To complete the definition of λ-u.d. sequences in non-rectangular subsets of R^(m), we must validate that such sequences exist, as the next lemma asserts.

Lemma 1. Let U⊂R^(m) be a compact, Jordan measurable subset of R^(m). There exists a λ-u.d. sequence {u_(i)}_(i=1) ^(∞) in U.

Proof. Without loss of generality we assume that U⊂[0,1]; otherwise, choose A to be a non-singular affine transformation of R^(m) such that A(U)⊂[0,1] and use Remark 5.

Let {u_(i)}_(i=1) ^(∞) be a λ-u.d. sequence in [0,1]. Define the subsequence {i_(k)}_(k=1) ^(∞) recursively: i₁=min{i≧1:u_(i)∈U} and i_(k)=min{i>i_(k−1):u_(i)∈U}, k≧2. That is, {i_(k)}_(k=1) ^(∞) is the maximal strictly increasing subsequence such that u_(i) _(k) ∈U for all k. Notice that since U is of positive measure, i_(k) is finite for every k, and thus, {i_(k)}_(k=1) ^(∞) is well defined. We will prove that the subsequence {u_(i) _(k) }_(k=1) ^(∞) is λ-u.d. in U. Let h be a Riemann integrable function with supp{h}⊂U. Since supp{h}⊂U, we have h(u_(i))=0 for i∉{i_(k)}_(k=1) ^(∞), hence

$\begin{matrix} {{\frac{1}{n}{\sum\limits_{k = 1}^{n}{h\left( u_{i_{k}} \right)}}} = {{\frac{1}{n}{\sum\limits_{i = 1}^{i_{n}}{h\left( u_{i} \right)}}} = {{\frac{i_{n}}{n} \cdot \frac{1}{i_{n}}}{\sum\limits_{i = 1}^{i_{n}}{h\left( u_{i} \right)}}}}} & (2.27) \end{matrix}$ for all n. By the construction of {i_(k)}_(k=1) ^(∞), exactly n of the first i_(n) elements of {u_(i) _(k) }_(k=1) ^(∞) belong to U. Hence, with χ_(U) denoting the characteristic function of U, for all n we have

$n = {\sum\limits_{i = 1}^{i_{n}}{{\chi_{U}\left( u_{i} \right)}.}}$ Notice that n≦i_(n) and thus n→∞ implies i_(n)→∞. Since U is Jordan measurable, the function χ_(U) is Riemann integrable so that we can use the λ-u.d. property of {u_(i)}_(i=1) ^(∞) in [0,1] to obtain

${\lim\limits_{n->\infty}\frac{n}{i_{n}}} = {{\lim\limits_{n->\infty}{\frac{1}{i_{n}}{\sum\limits_{i = 1}^{i_{n}}{\chi_{U}\left( u_{i} \right)}}}} = {{\int_{\lbrack{0,1}\rbrack}{{\chi_{U}(x)}{\mathbb{d}{\lambda(x)}}}} = {\lambda{\left\{ U \right\}.}}}}$ Using the same property of {u_(i)}_(i=1) ^(∞) again, we obtain

${\lim\limits_{n->\infty}{\frac{1}{i_{n}}{\sum\limits_{i = 1}^{i_{n}}{h\left( u_{i} \right)}}}} = {{\int_{\lbrack{0,1}\rbrack}{{h(x)}{\mathbb{d}{\lambda(x)}}}} = {\int_{U}{{h(x)}{{\mathbb{d}{\lambda(x)}}.}}}}$ Substituting into (2.27) yields

$\begin{matrix} {{\lim\limits_{n->\infty}{\frac{1}{n}{\sum\limits_{k = 1}^{n}{h\left( u_{i_{k}} \right)}}}} = {{\lim\limits_{n->\infty}{{\frac{i_{n}}{n} \cdot \frac{1}{i_{n}}}{\sum\limits_{i = 1}^{i_{n}}{h\left( u_{i} \right)}}}} = {\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{{h(x)}{{\mathbb{d}{\lambda(x)}}.}}}}}} & (2.28) \end{matrix}$ Since (2.28) holds for any Riemann integrable function h with supp{h}⊂U, the sequence {u_(i) _(k) }_(k=1) ^(∞) is λ-u.d. in U. 2(b)(ii) On the Transformation T^({u) ^(i) ^(})

Next, we elaborate on the relationship between the transformation T and the transformation T^({u) ^(i) ^(}), defined in (2.24). In order to do so, we restrict the discussion to a better behaved class of functions.

Given a function h, define L_(h)(t)={x∈supp{h}:h(x)≦t}. Denote R _(c)(R ^(m))={h∈B _(c)(R ^(m)):h is Riemann integrable and L _(h)(t) is Jordanmeasurable for all t}. That is, R_(c)(R^(m)) is the subset of B_(c)(R^(m)) of Riemann integrable functions that also have Jordan measurable sub-level sets, restricted to its support.

It should be noted that the additional requirement that L_(h)(t) is Jordan measurable for all t is not very strong. In [22] it is shown that given a Riemann integrable function h, for all except at most a countable number values of t, the subsets L_(h)(t) are Jordan measurable. That, in turn, implies that if L_(h)(t₀) is not Jordan measurable for some t₀ then, for arbitrarily small ε>0, the set {x∈supp{h}:t₀−ε<h(x)≦t₀} has a boundary of a positive measure. Hence, Riemann integrable functions that do not comply with the above requirement are, roughly speaking, irregular.

Moreover, from an applied point of view, restricting the discussion to R_(c)(R^(m)) imposes no significant practical limitations being “rich” enough to describe any sampled physical signal.

Lemma 2. Let U⊂R^(m) be a compact, Jordan measurable subset of R^(m) and {u_(i)}_(i=1) ^(∞) be a λ-u.d. sequence in U. For all h∈R_(c)(R^(m)) with supp{h}=U we have Th=T ^({u) ^(i) ^(}) h.  (2.29) If, in addition, h assumes only finitely many values, then for all t we have

$\begin{matrix} {\frac{\lambda\left\{ {{x \in {U:{h(x)}}} = t} \right\}}{\lambda\left\{ U \right\}} = {\lim\limits_{n->\infty}{\frac{1}{n}\#{\left\{ {{i = 1},\ldots\mspace{14mu},{{n:{h\left( u_{i} \right)}} = t}} \right\}.}}}} & (2.30) \end{matrix}$ Proof. Since h∈R_(c)(R^(m)), the set L_(h)(t) is Jordan measurable for all t. Equivalently, the function χ_((−∞,t])∘h is Riemann integrable on U for all t, as χ_((−∞,t])∘h=χL ^(h) _((t)) on U. Therefore, the bda-u.d. property of the sequence {u_(i)}_(i=1) ^(∞) may be applied to obtain

${\lbrack{Th}\rbrack(t)} = {{\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ h} \right\rbrack(x){\mathbb{d}{\lambda(x)}}}}} = {{\lim\limits_{n->\infty}{\frac{1}{n}{\sum\limits_{i = 1}^{n}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ h} \right\rbrack\left( u_{i} \right)}}}} = {{\lim\limits_{n->\infty}{\frac{1}{n}\#\left\{ {{i = 1},\ldots\mspace{14mu},{n:{{h\left( u_{i} \right)} \leq t}}} \right\}}} = {\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack{(t).}}}}}$ Hence, the first part of the claim is proved. Denote by {ν₁<ν₂< . . . <ν_(R) } the values h assumes under the finite range assumption. Obviously, (2.30) holds for t∉{ν₁, ν₂, . . . , ν_(R)}. Using (2.29), for t=ν_(r), r=1, . . . , R, we find that

$\frac{\lambda\left\{ {{x \in {U:{h(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}} = {{{\lbrack{Th}\rbrack\left( v_{r} \right)} - {\lbrack{Th}\rbrack\left( \frac{v_{r} + v_{r - 1}}{2} \right)}} = {\quad{{{{\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack\left( v_{r} \right)} - {\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack\left( \frac{v_{r} + v_{r - 1}}{2} \right)}} = {\lim\limits_{n->\infty}{\frac{1}{n}\#\left\{ {{i = 1},\ldots\mspace{14mu},{{n:{h\left( u_{i} \right)}} = v_{r}}} \right\}}}},}}}$ where ν₀ is arbitrarily set to be less than ν₁, which completes the proof.

Thus, for a proper selection of {u_(i)}_(i=1) ^(∞), the transformation T can be calculated by means of {T_(n) ^({u) ^(i) ^(})}_(n=1) ^(∞) on the well-behaved class of functions R_(c)(R^(m)).

We conclude this section with the following simple lemmas:

Lemma 3. If h, h∈B_(c)(R^(m)) such that h≦ h then T_(n) ^({u) ^(i) ^(}) h≦T_(n) ^({u) ^(i) ^(}) h for all n∈N.

Proof. Simply follows by noticing that {x: h(x)≦t}⊂{x:h(x)≦t}.

Lemma 4. If Q:R→R is a continuous strictly increasing function such that Q(0)=0 and A:R^(m)→R^(m) is a non-singular affine transformation, then Q∘g∘A∈R _(c)(R ^(m)), for all g∈R _(c)(R ^(m)). Proof. Let g∈R_(c)(R^(m)). Since A is non-singular affine, g∘A is bounded and compactly supported. As g is Riemann integrable, its set of discontinuities has measure zero. Since {x:g∘A is continuous at x}=A⁻¹({x:g is continuous at x}), and thus g∘A we conclude that the set of discontinuities of g∘A is of measure zero as well. Therefore, g∘A is Riemann integrable. The set L_(g∘A)(t)=A⁻¹(L_(g)(t)) is Jordan measurable for all t∈R, from which we conclude that g∘A∈R_(c)(R^(m)). On the other hand, Q is a continuous strictly increasing on R and thus Q∘g is bounded. Q∘g is compactly supported since Q(0)=0. Since Q is continuous, we have that {x:Q∘g is continuous at x}={x:g is continuous at x} and thus, as g is Riemann integrable, so is Q∘g. Finally, L_(Q∘g)(t)=L_(g)(Q⁻¹(t)), and thus Q∘g∈R_(c)(R^(m)). 2(c) Distribution Transformations of Random Fields

So far, we have discussed the properties of a family of distribution transformations when applied to deterministic functions. We shall now discuss the results of applying the transformations T_(n) ^({u) ^(i) ^(}) and T^({u) ^(i) ^(}) to a random field.

Let {η(x):x∈R^(m)} be a real-valued i.i.d. random field on (Ω,F,P) with a known probability distribution function F_(η). Let {u_(i)}_(i=1) ^(∞) be a given sequence of distinct points in R^(m). The transformation T_(n) ^({u) ^(i) ^(}) can now be applied to η. Put:

${F^{(n)}(t)} = {\quad{{\left\lbrack {T_{n}^{\{ u_{i}\}}\eta} \right\rbrack(t)} = {\frac{1}{n}\#{\left\{ {{i = 1},\ldots\mspace{14mu},{n:{{\eta\left( u_{i} \right)} \leq t}}} \right\}.}}}}$ F^((n)) is known as the empirical distribution function of {η(u_(i))}_(i=1) ^(n). For fixed t, F^((n))(t) is a random variable (of the implicit variable ω). For a realization of the random field (i.e., fixed ω) the function F^((n))(t) is a distribution function as it is an increasing step function jumping by 1/n at each point η(u_(i)).

In this context, the Glivenko-Cantelli theorem can be rephrased to state that:

${{\lim\limits_{n->\infty}{F^{(n)}(t)}} = {{F_{\eta}(t)}\mspace{14mu}{a.s.}}},$ uniformly in t, that is,

${\lim\limits_{n\rightarrow\infty}{{F^{(n)} - F_{\eta}}}_{\infty}} = 0$ with probability 1. Therefore, in terms of the transformations we have previously defined, T^({u) ^(i) ^(})η=lim_(n→∞)T_(n) ^({u) ^(i) ^(})η=F_(η) with probability 1. Hence, for any sequence of distinct points {u_(i)}_(i=1) ^(∞) ⊂R^(m) the transformation T^({u) ^(i) ^(}) is a strongly consistent non-parametric estimator for the probability distribution function of the random field {η(x):x∈R^(m)}.

We conclude this section with the following lemma:

Lemma 5. Let S be an infinite subset of {u_(i):i=1,2, . . . }. Then

${{\lim\limits_{n\rightarrow\infty}\frac{\#\;\left\{ {{i = 1},\ldots\mspace{14mu},{{n\text{:}u_{i}} \in {{S\mspace{20mu}{and}\mspace{20mu}{\eta\left( u_{i} \right)}} \leq t}}} \right\}}{\#\;\left\{ {{i = 1},\ldots\mspace{14mu},{{n\text{:}u_{i}} \in S}} \right\}}} = {F_{\eta}(t)}}\mspace{14mu}$ a.s., uniformly in t. Proof. Let

${F_{n}^{S}(t)} = \frac{\#\;\left\{ {{i = \text{1}},\ldots\mspace{14mu},{{n\text{:}u_{i}} \in {{S\mspace{20mu}{and}\mspace{20mu}{\eta\left( u_{i} \right)}} \leq t}}} \right\}}{\#\;\left\{ {{i = 1},\ldots\mspace{14mu},{{n\text{:}u_{i}} \in S}} \right\}}$ Given the subset S , let {i_(k)}_(k=1) ^(∞) be a strictly increasing subsequence of indices such that S={u_(i) _(k) }_(k=1) ^(∞). We then have

$\begin{matrix} {{F_{i_{k}}^{S}(t)} = \frac{\#\;\left\{ {{i = 1},\ldots\mspace{14mu},{{i_{k}\text{:}u_{i}} \in {{S\mspace{20mu}{and}\mspace{20mu}{\eta\left( u_{i} \right)}} \leq t}}} \right\}}{\#\;\left\{ {{i = 1},\ldots\mspace{14mu},{{i_{k}\text{:}u_{i}} \in S}} \right\}}} \\ {= {\frac{\#\;\left\{ {{j = 1},\ldots\mspace{14mu},{{k\text{:}{\eta\left( u_{i_{j}} \right)}} \leq t}} \right\}}{k}.}} \end{matrix}$ Now, by the Glivenko-Cantelli theorem we conclude that lim_(k→∞)F_(i) _(k) ^(S)(t)=F_(η)(t) a.s., uniformly in t. In particular, {F_(n) ^(S)(t)}_(n=1) ^(∞) has a convergent subsequence. Furthermore, since {i_(k)} is strictly increasing, u_(i) _(k) ₊₁,u_(i) _(k) ₊₂, . . . , u_(i) _((k+1)) ⁻¹∉S. Thus, F_(n) ^(S)(t) is constant for n=i_(k),i_(k)+1, . . . , i_(k+1)−1, that is, F_(n) ^(S)(t)=F_(i) _(k) ^(S)(t) for i_(k)≦n<i_(k+1). Since |F _(n) ^(S)(t)−F _(η)(t)|≦|F _(n) ^(S)(t)−F _(i) _(k) ^(S)(t)|+|F _(i) _(k) ^(S)(t)−F _(η)(t)| the choice i_(k)≦n<i_(k+1) eliminates |F_(n) ^(S)(t)−F_(i) _(k) ^(S)(t)|, hence we have

${\lim\limits_{n\rightarrow\infty}{F_{n}^{S}(t)}} = {{\lim\limits_{k\rightarrow\infty}{F_{i_{k}}^{S}(t)}} = {F_{\eta}(t)}}$ a.s., uniformly in t. 2(d) Distribution Transformation of the Additive Model

In this section, we return to discuss the additive model. Suppose that h takes the form h(x)=g(x)+η(x), x∈supp{g},  (2.31) where g∈R_(c)(R^(m)) is a deterministic function and {η(x):x∈R^(m)} is a real-valued i.i.d. random field with distribution function F_(η). Let U=supp{g } and {u_(i)}_(i=1) ^(∞) be a λ-u.d. sequence of distinct points in U (such exist, according to Lemma 1). Proposition 6. If g assumes only finitely many values {ν₁, . . . , ν_(R)}, then

${\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack(t)} = {\sum\limits_{r = 1}^{R}\;{\frac{\lambda\left\{ {{x \in {U\text{:}{g(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}}{F_{\eta}\left( {t - v_{r}} \right)}\mspace{14mu}{a.s.}}}$ Proof. By the definition of T_(n) ^({u) ^(i) ^(})

$\begin{matrix} {{\left\lbrack {T_{n}^{\{ u_{i}\}}h} \right\rbrack(t)} = {\frac{1}{n}\#\left\{ {{i = 1},\ldots\mspace{14mu},\;{{{n\text{:}{g\left( u_{i} \right)}} + {\eta\left( u_{i} \right)}} \leq t}} \right\}}} & (2.32) \end{matrix}$ for all n and all t. Since g assumes finitely many values, the right-hand side of (2.32) decomposes into a finite sum:

$\begin{matrix} {{\frac{1}{n}\#\left\{ {{i = 1},\ldots\mspace{14mu},\;{{{n\text{:}{g\left( u_{i} \right)}} + {\eta\left( u_{i} \right)}} \leq t}} \right\}} = {\sum\limits_{r = 1}^{R}\;{\frac{1}{n}\#{\left\{ {{i = 1},\ldots\mspace{14mu},\;{{n\text{:}{g\left( u_{i} \right)}} = {{v_{r}\mspace{20mu}{and}\mspace{20mu}\eta\left( u_{i} \right)} \leq {t - v_{r}}}}} \right\}.}}}} & (2.33) \end{matrix}$ Without loss of generality, we may assume there exists an n₀ such that the sets {i=1, . . . , n₀:g(u_(i))=ν_(r)} are non-empty for r=1, . . . , R; otherwise, the empty terms in (2.33) may be excluded. Hence, for n≧n₀, each term of the sum on the right-hand side of (2.33) may be written as a product of two factors

${{\frac{1}{n}\#\left\{ {{i = 1},\ldots\mspace{14mu},\;{{n\text{:}{g\left( u_{i} \right)}} = {{v_{r}\mspace{20mu}{and}\mspace{14mu}{\eta\left( u_{i} \right)}} \leq {t - v_{r}}}}} \right\}} = {D_{n}^{(r)} \cdot {F_{n}^{(r)}\left( {t - v_{r}} \right)}}},$ where we denote

$D_{n}^{(r)} = \frac{{\#\left\{ {{i = 1},\ldots\mspace{14mu},\;{{n\text{:}{g\left( u_{i} \right)}} = v_{r}}} \right\}}\mspace{11mu}}{n}$ ${F_{n}^{(r)}(t)} = {\frac{\#\;\left\{ {{i = 1},\ldots\mspace{11mu},{{n\text{:}{g\left( u_{i} \right)}} = {{v_{r}\mspace{25mu}{and}\mspace{20mu}{\eta\left( u_{i} \right)}} \leq t}}} \right\}}{\#\;\left\{ {{i = 1},\ldots\mspace{14mu},{{n\text{:}{g\left( u_{i} \right)}} = v_{r}}} \right\}}.}$ Notice that {D_(n) ^((r))}_(n=1) ^(∞) is a deterministic sequence, while {F_(n) ^((r))(•)}_(n=1) ^(∞) is a sequence of random processes. With these notations,

${\left\lbrack {T_{n}^{\{ u_{i}\}}h} \right\rbrack(t)} = {\sum\limits_{r = 1}^{R}{D_{n}^{(r)} \cdot {F_{n}^{(r)}\left( {t - v_{r}} \right)}}}$ Now, since the conditions of Lemma 2 are satisfied,

${\lim\limits_{n\rightarrow\infty}D_{n}^{(r)}} = {{\lim\limits_{n\rightarrow\infty}\frac{{\#\left\{ {{i = 1},\ldots\mspace{14mu},\;{{n\text{:}{g\left( u_{i} \right)}} = v_{r}}}\; \right\}}\;}{n}} = \frac{\lambda\left\{ {{x \in {U\text{:}{g(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}}}$ for r=1, . . . , R. Denote the limit D^((r))=lim_(n→∞)D_(n) ^((r)). Further notice that since the discrete-parameter random process {η(u_(i)):i=1,2, . . . } satisfies the conditions of the Glivenko-Cantelli theorem, Lemma 5 may be used to obtain

${\lim\limits_{n\rightarrow\infty}{F_{n}^{(r)}(t)}} = {{F_{\eta}(t)}\mspace{20mu}{a.s.}}$ uniformly in t. Finally, since with probability 1 we have ∥D _(n) ^((r)) F _(n) ^((r)) −D ^((r)) F _(η)∥_(∞) =∥D _(n) ^((r)) F _(n) ^((r)) −D ^((r)) F _(n) ^((r)) +D ^((r)) F _(n) ^((r)) −D ^((r)) F _(η)∥_(∞) ≦|D _(n) ^((r)) −D ^((r)) |∥F _(n) ^((r))∥_(∞) +D ^((r)) ∥F _(n) ^((r)) −F _(η)∥_(∞), the limit lim_(n→∞){D_(n) ^((r))·F_(n) ^((r))(t−ν_(r))} exists almost surely for all r, and we find that

$\begin{matrix} {{\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack(t)} = {{\lim\limits_{n\rightarrow\infty}{\left\lbrack {T_{n}^{\{ u_{i}\}}h} \right\rbrack(t)}} = {{\sum\limits_{r = 1}^{R}\;{\lim\limits_{n\rightarrow\infty}\left\{ {D_{n}^{(r)} \cdot {F_{n}^{(r)}\left( {t - v_{r}} \right)}} \right\}}} = {\sum\limits_{r = 1}^{R}{\frac{\lambda\left\{ {{x \in {U\text{:}{g(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}}{F_{\eta}\left( {t - v_{r}} \right)}\mspace{14mu}{a.s.}}}}}} & (2.34) \end{matrix}$ which concludes the proof.

In the special case, where the random field {η(x):x∈R^(m)} has an absolutely continuous probability distribution, we have the following result:

Theorem 7. Let f_(η) be the probability density function of the random field {η(x):x∈R^(m)}. Then, the limit

$\lim\limits_{n\rightarrow\infty}{T_{n}^{\{ u_{i}\}}h}$ exists, and T ^({u) ^(i) ^(}) h=[T ^({u) ^(i) ^(}) g]*f _(η) a.s. Furthermore, this equality also holds in L_(p)(Ω,P)-norm, 1≦p<∞. Proof. We split the proof into two steps. First, we prove the assertion for g∈R_(c)(R^(m)) that only assumes finitely many values. We then extend the result to an arbitrary g∈R_(c)(R^(m)). We shall use the same notations as in the proof of Lemma 2.6. Notice that F_(η) can be represented as the convolution of f_(η) with the unit step function χ_([0,∞)),

F_(η)(t) = ∫_(−∞)^(t)f_(η)(τ)𝕕τ = ∫_(−∞)^(∞)f_(η)(τ)χ_([0, ∞))(t − τ)𝕕τ for all t∈R. Substituting into (2.34) yields

$\begin{matrix} {{\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack(t)} = {\sum\limits_{r = 1}^{R}{\frac{\lambda\left\{ {{x \in {U:{g(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}}{\int_{- \infty}^{\infty}{{f_{\eta}(\tau)}{\chi_{\lbrack{0,\infty})}\left( {t - v_{r} - \tau} \right)}{\mathbb{d}\tau}}}}}} & (2.35) \end{matrix}$ with probability 1. Since χ_([0,∞))(t−ν_(r)−τ)=χ_([ν) _(r) _(,∞))(t−τ), we have

$\begin{matrix} {{\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack(t)} = {\int_{- \infty}^{\infty}{{{f_{\eta}(\tau)}\left\lbrack {\sum\limits_{r = 1}^{R}{\frac{\lambda\left\{ {{x \in {U:{g(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}}\chi_{\lbrack{v_{r},\infty})}}} \right\rbrack}\left( {t - \tau} \right){{\mathbb{d}\tau}.}}}} & (2.36) \end{matrix}$ Clearly,

$\begin{matrix} {{\left\lbrack {\sum\limits_{r = 1}^{R}{\frac{\lambda\left\{ {{x \in {U:{g(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}}\chi_{\lbrack{v_{r},\infty})}}} \right\rbrack(t)} = {{\sum\limits_{\{{v_{r} \leq t}\}}\frac{\lambda\left\{ {{x \in {U:{g(x)}}} = v_{r}} \right\}}{\lambda\left\{ U \right\}}} = {\frac{\lambda\left\{ {x \in {U:{{g(x)} \leq t}}} \right\}}{\lambda\left\{ U \right\}} = {\lbrack{Tg}\rbrack{(t).}}}}} & (2.37) \end{matrix}$ Substituting (2.37) into (2.36), we obtain

$\begin{matrix} {{\left\lbrack {T^{\{ u_{i}\}}h} \right\rbrack(t)} = {{\int_{- \infty}^{\infty}{{{f_{\eta}(\tau)}\lbrack{Tg}\rbrack}\left( {t - \tau} \right){\mathbb{d}\tau}}} = {\left\lbrack {\lbrack{Tg}\rbrack*f_{\eta}} \right\rbrack(t)\mspace{14mu}\text{a.s.}}}} & (2.38) \end{matrix}$ Finally, since {u_(i)}_(i=1) ^(∞) is a λ-u.d. sequence in U, Lemma 2 implies that Tg=T^({u) ^(i) ^(})g, and therefore [T ^({u) ^(i) ^(}) h](t)=([T ^({u) ^(i) ^(}) g]*f _(η))(t) a.s.  (2.39) Thus, the assertion is proved, given that g∈R_(c)(R_(m)) is also a simple function, that is, g only assumes finitely many values.

Next, we extend this result to an arbitrary g∈R_(c)(R^(m)) by means of approximation from below and from above.

Let g _(k)=└kg┘/k, k≧1. It is easy to see that { g _(k)} is a sequence of simple functions in R_(c)(R^(m)) such that g _(k)≦g and g _(k)→g pointwise. Importantly, this also implies that (χ_((−∞,t]) ∘ g _(k))→(χ_((−∞,t]) ∘g)  (2.40) pointwise, for all t. This important property is simply due to the left continuity of χ_((−∞,t]) and the fact that g _(k)≦g.

Similarly, let {tilde over (g)} _(k)=┌kg┐/k, k≧1. Then, {{tilde over (g)} _(k)} is a sequence of simple functions in R_(c)(R^(m)) such that g≦{tilde over (g)} _(k) and {tilde over (g)} _(k)→g pointwise. In this case, however, a property similar to (2.40) is not guaranteed. Namely, fix t∈R, x∈R^(m), and examine [χ_((−∞,t])∘{tilde over (g)} _(k)](x) as k→∞; three possible cases arise: (i) if g(x)>t then [χ_((−∞,t])∘{tilde over (g)} _(k)](x)=[χ_((−∞,t])∘g](x)=0 for all k; (ii) if g(x)<t then there exists some k₀∈N such that [χ_((−∞,t])∘e{tilde over (g)}_(k)](x)=[χ_((−∞,t])∘g](x)=1 for all k>k₀; (iii) if g(x)=t then [χ_((−∞,t])∘{tilde over (g)} _(k)](x)≦[χ_((−∞,t])∘g](x)=1 for all k. Hence, a problem may occur for values of t such that {x:g(x)=t} has a positive measure. This problem, however, is simple to rectify since {x:g(x)=t} has zero measure for all except at most a countable number values of t. Let {t_(k)} be the values of t for which {x:g(x)=t} has a positive measure, and define

${{\underset{\_}{g}}_{k}(x)} = \left\{ \begin{matrix} {{g(x)},} & {{{g(x)} = t_{j}},{j = 1},\ldots\mspace{14mu},k} \\ {{{\overset{\sim}{\underset{\_}{g}}}_{k}(x)},} & {{otherwise}.} \end{matrix} \right.$ Clearly, {g _(k)} is a sequence of simple functions in R_(c)(R^(m)) such that g≦g _(k) and g _(k)→g pointwise. Moreover, (χ_((−∞,t]) ∘g _(k))→(χ_((−∞,t]) ∘g) a.e.  (2.41) for all t. Recall that h=g+η, and similarly denote h _(k) =g _(k) +η, h _(k) = g _(k)+η. Since g _(k) and g _(k) assume only finitely many values, (2.39) implies that there exist subsets Ω ₀ ^((k)), Ω ₀ ^((k)) ⊂Ω, k=1,2, . . . , of measure one such that T ^({u) ^(i) ^(}) h _(k) =[T ^({u) ^(i) ^(}) g _(k) ]*f _(η) , onΩ ₀ ^((k))  (2.42) and T ^({u) ^(i) ^(}) h _(k) =[T ^({u) ^(i) ^(}) g _(k) ]*f _(η) , on Ω ₀ ^((k)).  (2.43) Denote

${\Omega_{0} = {\bigcap\limits_{k}\left( {{\underset{\_}{\Omega}}_{0}^{(k)}\bigcap{\overset{\_}{\Omega}}_{0}^{(k)}} \right)}},$ so that, Ω₀ is again of measure one, being a countable intersection of sets of measure one. Now, fix ω∈Ω₀ (i.e., fix a realization of η). Since g _(k)≦g≦g _(k), we also have h _(k)≦h≦h _(k). According to Lemma 3, for all n∈N we have T _(n) ^({u) ^(i) ^(}) h _(k) ≦T _(n) ^({u) ^(i) ^(}) h≦T _(n) ^({u) ^(i) ^(}) h _(k). Taking n→∞ gives, for every k,

$\begin{matrix} {{T^{\{ u_{i}\}}{\underset{\_}{h}}_{k}} = {{{\lim\limits_{n->\infty}\;{T_{n}^{\{ u_{i}\}}{\underset{\_}{h}}_{k}}} \leq {\underset{n->\infty}{\lim\;\inf}T_{n}^{\{ u_{i}\}}h} \leq {\underset{n->\infty}{\lim\;\sup}T_{n}^{\{ u_{i}\}}h} \leq {\lim\limits_{n->\infty}{T_{n}^{\{ u_{i}\}}{\overset{\_}{h}}_{k}}}} = {T^{\{ u_{i}\}}{{\overset{\_}{h}}_{k}.}}}} & (2.44) \end{matrix}$ We shall show that T^({u) ^(i) ^(}) h _(k) and T^({u) ^(i) ^(}) h _(k) tend to the same limit as k→∞. Notice that, by using Lemma 2 and the integral form of T, we have

${\left\lbrack {T^{\{ u_{i}\}}g} \right\rbrack(t)} = {{\lbrack{Tg}\rbrack(t)} = {\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ g} \right\rbrack(x){\mathbb{d}{\lambda(x)}}}}}}$ ${{{and}\left\lbrack {T^{\{ u_{i}\}}{\underset{\_}{g}}_{k}} \right\rbrack}(t)} = {{\left\lbrack {T{\underset{\_}{g}}_{k}} \right\rbrack(t)} = {\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ {\underset{\_}{g}}_{k}} \right\rbrack(x){{\mathbb{d}{\lambda(x)}}.}}}}}$ Recall that {g _(k)} satisfies (χ_((−∞,t])∘g _(k))→(χ_((−∞,t])∘g) a.e. for all t. Hence, Lebesgue's bounded convergence theorem may be employed to show that

${\lim\limits_{k->\infty}{\left\lbrack {T^{\{ u_{i}\}}{\underset{\_}{g}}_{k}} \right\rbrack(t)}} = {{\lim\limits_{k->\infty}{\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ {\underset{\_}{g}}_{k}} \right\rbrack(x){\mathbb{d}{\lambda(x)}}}}}} = {{\frac{1}{\lambda\left\{ U \right\}}{\int_{U}{\left\lbrack {\chi_{({{- \infty},t}\rbrack} \circ g} \right\rbrack(x){\mathbb{d}{\lambda(x)}}}}} = {\left\lbrack {T^{\{ u_{i}\}}g} \right\rbrack(t)}}}$ for all t. That is, we have

${\lim\limits_{k->\infty}{T^{\{ u_{i}\}}{\underset{\_}{g}}_{k}}} = {T^{\{ u_{i}\}}{g.}}$ Since [T^({u) ^(i) ^(}) g _(k)](t−τ)·f_(η)(τ)≦f_(η)(τ) and f_(η) is integrable, the dominated convergence theorem may used to show that

${\lim\limits_{k->\infty}{\left\lbrack {\left\lbrack {T^{\{ u_{i}\}}{\underset{\_}{g}}_{k}} \right\rbrack*f_{\eta}} \right\rbrack(t)}} = {{\lim\limits_{k->\infty}{\int_{- \infty}^{\infty}{\left\lbrack {T^{\{ u_{i}\}}{\underset{\_}{g}}_{k}} \right\rbrack{\left( {t - \tau} \right) \cdot {f_{\eta}(\tau)}}{\mathbb{d}\tau}}}} = {{\int_{- \infty}^{\infty}{\left\lbrack {T^{\{ u_{i}\}}g} \right\rbrack{\left( {t - \tau} \right) \cdot {f_{\eta}(\tau)}}{\mathbb{d}\tau}}} = {\left\lbrack {\left\lbrack {T^{\{ u_{i}\}}g} \right\rbrack*f_{\eta}} \right\rbrack(t)}}}$ for all t. Lastly, we evaluate (2.42) as k→∞ to conclude that

$\begin{matrix} {{\lim\limits_{k->\infty}{T^{\{ u_{i}\}}{\underset{\_}{h}}_{k}}} = {{\lim\limits_{k->\infty}{\left\lbrack {T^{\{ u_{i}\}}{\underset{\_}{g}}_{k}} \right\rbrack*f_{\eta}}} = {\left\lbrack {T^{\{ u_{i}\}}g} \right\rbrack*{f_{\eta}.}}}} & (2.45) \end{matrix}$ Similar derivations show that

$\begin{matrix} {{\lim\limits_{k->\infty}{T^{\{ u_{i}\}}{\overset{\_}{h}}_{k}}} = {{\lim\limits_{k->\infty}{\left\lbrack {T^{\{ u_{i}\}}{\overset{\_}{g}}_{k}} \right\rbrack*f_{\eta}}} = {\left\lbrack {T^{\{ u_{i}\}}g} \right\rbrack*{f_{\eta}.}}}} & (2.46) \end{matrix}$ Thus, by taking the limit k→∞ in (2.44), we can conclude that the limit

${T^{\{ u_{i}\}}h} = {\lim\limits_{n->\infty}{T_{n}^{\{ u_{i}\}}h}}$ exists and [T ^({u) ^(i) ^(}) h](t)=([T ^({u) ^(i) ^(}) g]*f _(η))(t) a.s. for all t. Moreover, notice that since both T^({u) ^(i) ^(})h and [T^({u) ^(i) ^(})g]*f_(η) are distribution functions bounded by 1, we have that for all t, |[T ^({u) ^(i) ^(}) h](t)−([T ^({u) ^(i) ^(}) g]*f _(η))(t)|≦2. Therefore, by using Lebesgue's bounded convergence theorem, we may also conclude that [T ^({u) ^(i) ^(}) h](t)=([T ^({u) ^(i) ^(}) g]*f _(η))(t) inL _(p)(Ω)−norm,1≦p<∞, for all t, which completes the proof. 2(e) Estimation of Radiometry and Geometry in the Presence of Noise

With the results obtained in the last section we may return to discuss the original joint geometric-radiometric estimation problem in the presence of noise:

Given the known function g and a single measurement (observation) h of the form h(x)=[Q∘g∘A](x)+η(x)  (2.47) for all x∈supp{Q∘g∘A}, find an estimate for the left-hand composition Q and the affine transformation A.

To proceed, let us assume that g∈R_(c)(R^(m)) and that the random field {η(x):x∈R^(m)} admits a probability density function f_(η) (i.e., the random field {η(x):x∈R^(m)} has an absolutely continuous probability distribution).

According to Lemma 4, Q∘g∘A is in R_(c)(R^(m)) as well. Let {u_(i)}_(i=1) ^(∞) be a λ-u.d. sequence in supp{Q∘g∘A}.

In this setting, Theorem 7 may be employed to obtain T ^({u) ^(i) ^(}) h=[T ^({u) ^(i) ^(})(Q∘g∘A)]*f _(η) a.s.  (2.48) Since (Q∘g∘A)∈R_(c)(R^(m)), according to Lemma 2 we have T ^({u) ^(i) ^(})(Q∘g∘A)=T(Q∘g∘A). Therefore, Corollary 2 may be invoked to obtain T(Q∘g∘A)=[Tg]∘Q ⁻¹. Thus, (2.48) becomes T ^({u) ^(i) ^(}) h=([Tg]∘Q ⁻¹)*f _(η) a.s.  (2.49) Hence, The following corollary may now be stated: Corollary 8. Let H(t)=[T^({u) ^(i) ^(})h](t) and G(t)=[Tg](t). Then, for all t∈R the following relation almost surely holds H(t)=[(G∘Q ⁻¹)*f _(η)](t).  (2.50)

The stochastic relation expressed in (2.21) describes a noisy observation about a template deformed both geometrically and radiometrically. Using the transformation T we have (almost surely) “converted” this stochastic relation to anew deterministic functional relation. This relation is expressed by a right-hand composition Q⁻¹ (i.e., a deformation of coordinates), followed by the action of an LTI system. Interestingly, the impulse response of this LTI system is given by f_(η), which represents the statistics of the random field η.

In fact, notice that Corollary 8 is a generalization of Corollary 2 (obtained for the noiseless case). Indeed, in the case where η vanishes, f_(η)=δ (in the sense of an atomic measure concentrated in zero) and thus (2.50) reduces to the result obtained in Corollary 2. Hence, in order to estimate the left-hand composition Q, the original stochastic registration problem can be replaced, with probability one, by a “new” deterministic one.

From a “classic” estimation theory point of view, this deterministic problem has the form of a deconvolution problem. Solution of the deconvolution problem reduces (2.50) to the form (1.6) derived for the noise-free case. Thus, as in the noiseless case, the original problem may then be solved in two stages (section 1). Also note that (2.50) implies robustness of the noiseless algorithm when applied to a measurement of high signal to noise ratio. The high SNR case can be characterized by a probability density function f_(η) that approximates the Dirac delta function. Thus, in high SNR, (2.50) naturally reduces to the noiseless equation (1.6), and Q may be estimated as before, as we demonstrate in section 3.

On the other hand, one may ask whether it is necessary to employ deconvolution in order to solve (2.50) for the unknown transformation Q⁻¹? Possibly, the latter may be “accessed” without explicitly restoring the function (G∘Q⁻¹).

In fact, this question turns out to be extremely important, especially in practice. As f_(η) is the probability density function of the noise process η, it is not bounded from below and thus an inverse filter does not exist. Moreover, exact equality in (2.50) is only obtained asymptotically, whereas in practice small perturbations may occur (see section 1 for more details). Hence, any attempt to perform deconvolution (e.g., pseudo-inverse, truncated inverse, etc.) will result in severe instabilities, leading to model mismatches and large estimation errors.

Nevertheless, as we show in the following, with fairly weak assumptions, it is possible to solve (2.50) for the unknown transformation Q⁻¹ without explicitly restoring the function (G∘Q⁻¹) by means of deconvolution.

In order to do so, we exploit the monotone nature of (2.50). Namely, since H and G are distribution functions, they are invertible (possibly in a weak sense, e.g., their pseudo-inverse may be defined). We shall assume that the function G is invertible on its domain of variation; that is, that G|_(0<G<1) is a continuous strictly increasing function. Throughout, restrictions to the domain of invertibility will be omitted in order to simplify the notations.

The method we propose is based on a duality property of left and right compositions applied to monotone functions. More explicitly, since G is invertible, there exists an invertible function W such that G∘Q ⁻¹ =W∘G.  (2.51) Namely, set W(t)=G(Q⁻¹(G⁻¹(t))). In other words, as demonstrated in FIG. 6, since G is increasing, any (continuous and increasing) deformation of its domain (right hand composition with Q⁻¹) may be equivalently reproduced by applying a non-linear mapping to the range of G (left hand composition with W). Hence, instead of recovering (estimating) Q it is sufficient to recover W.

To proceed, we endow W with a linear model. W is in L₂((0,1)) and thus can be represented as

$\begin{matrix} {{{W(t)} = {\sum\limits_{i}{b_{i}{e_{i}(t)}}}},} & (2.52) \end{matrix}$ where {e_(i)} is a countable basis of L₂((0,1)). Hence, (2.51) may be rewritten as

$\begin{matrix} {{{G \circ Q^{- 1}} = {{\left( {\sum\limits_{i}{b_{i}e_{i}}} \right) \circ G} = {\sum\limits_{i}{b_{i}\left( {e_{i} \circ G} \right)}}}},} & (2.53) \end{matrix}$ where the last equality is due to associativity. Therefore, using (2.53) we find that (2.50) becomes

$\begin{matrix} \begin{matrix} {H = {\left( {G \circ Q^{- 1}} \right)*f_{\eta}}} \\ {{= {{\left( {\sum\limits_{i}{b_{i}\left\lbrack {e_{i} \circ G} \right\rbrack}} \right)*f_{\eta}} = {\sum\limits_{i}{b_{i}\left( {\left\lbrack {e_{i} \circ G} \right\rbrack*f_{\eta}} \right)}}}},} \end{matrix} & (2.54) \end{matrix}$ where the last equality holds under the assumption of W admitting a finite expansion in (2.52).

That is, the functional equation (2.50) in Q⁻¹ may be reformulated as a linear parametric problem. Equation (2.54) may easily be solved for the unknown coefficients {b_(i)}; as (e_(i)∘G) may be easily calculated and (2.54) holds for all t, a linear system of equations may be constructed by choosing different t's.

Namely, let M denote the model order of the expansion of W in (2.52). Then for t₁,t₂, . . . , t_(N)∈supp{(H|_(0<H<1))⁻¹} the following linear system is obtained

$\begin{matrix} {\begin{bmatrix} \begin{matrix} {H\left( t_{1} \right)} \\ \vdots \end{matrix} \\ {H\left( t_{N} \right)} \end{bmatrix} = {{\begin{bmatrix} {\left( {\left\lbrack {e_{1} \circ G} \right\rbrack*f_{\eta}} \right)\left( t_{1} \right)} & \ldots & \left( {\left\lbrack {e_{M} \circ G} \right\rbrack*{f_{\eta}\left( t_{1} \right)}} \right. \\ \vdots & \; & \vdots \\ {\left( {\left\lbrack {e_{1} \circ G} \right\rbrack*f_{\eta}} \right)\left( t_{N} \right)} & \ldots & \left( {\left\lbrack {e_{M} \circ G} \right\rbrack*{f_{\eta}\left( t_{N} \right)}} \right. \end{bmatrix}\begin{bmatrix} \begin{matrix} b_{1} \\ \vdots \end{matrix} \\ b_{M} \end{bmatrix}}.}} & (2.55) \end{matrix}$ Employing linear least squares (N≧M) shall then provide us with an estimate for the unknown coefficients {b_(i)}.

Once W has been determined, in terms of its parameters {b_(i)}, Q may be directly calculated by noticing that Q(t)=G ⁻¹(W ⁻¹(G(t))).  (2.56) Having estimated Q, we can now return to the original problem (2.21) in order to estimate the geometric transformation A. Equation (2.21) may be rewritten as

$\begin{matrix} {h = {{\underset{\underset{known}{︸}}{\left( {Q \circ g} \right)} \circ A} + {\eta.}}} & (2.57) \end{matrix}$ Hence, the problem reduces to a strictly geometric registration problem in the presence of additive measurement noise. In section 1(b) we have outlined an explicit solution method to such registration problem in the noiseless case; it was based on a non-linear method for constructing linear constraints in the unknown parameters of the affine transformation A. The same method may also be applied in the case of noisy measurements; in R. Hagege and J. M. Francos, Parametric estimation of multi-dimensional affine transformations in the presence of noise: a linear solution, In Proc. IEEE/SP 13th Workshop on Statistical Signal Processing, pages 55-58, 2005, a stochastic estimation problem of the type (2.57) is analyzed, and a method for constructing unbiased optimal constraints is derived; again, the solution is obtained by simply solving a linear system of equations (or more accurately by solving an over-determined system of constraints as a linear least squares problem). 3. Experimental Results and Summary

This section presents experimental results obtained using the methods proposed in sections 1 and 2. The applicability and performance of the proposed methods are demonstrated using synthetic and natural data.

3(a) Implementation Remarks

Throughout, we have considered images as functions over a continuous domain (coordinates). Due to the inherent physical properties of the problem, it is natural to model and solve it in the continuous domain. Inherently, the mapping A of R^(m) into itself is of a continuous nature, as is the physical phenomenon of geometric deformation of real-life objects it represents. Thus, if we impose a discrete model (e.g., x∈Z^(m)), we find that, in general, the natural A to consider is incompatible (as for “almost all” x∈Z^(m), A(x)∉Z^(m)). In practice, in order to apply the proposed methods to digital images we utilized an approximated discrete form of each expression, by a straightforward replacement of the integrals by their corresponding finite sums.

For example, in practice, the sample distribution transformation T, defined in (1.3), is replaced with the following discrete approximation:

$\begin{matrix} {{\lbrack{Th}\rbrack(t)} = {\frac{\#\left\{ {\left( {i,j} \right):{0 < {h\left( {i,j} \right)} \leq t}} \right\}}{\#\left\{ {\left( {i,j} \right):{{h\left( {i,j} \right)} > 0}} \right\}}.}} & (3.58) \end{matrix}$ [Th](t) is then only calculated on the finite range of h (e.g., t=0,1, . . . , 255 in the common case of 8-bit images). Other expressions are similarly discretized.

Obviously, using such discrete approximation introduces errors and model mismatches. However, the high resolution of typical (optical) digital imaging allows for these errors to be reasonably small, and satisfactory estimation results are obtained. Moreover, a rigorous analysis of the errors introduced in the estimation of the geometric deformation due to sampling and quantization (in (1.12) and (1.18)) is possible, and methods for minimizing these errors are available. Since the problem of joint geometric-radiometric estimation has been decoupled into two strictly geometric estimation problems (section 1(b)(iii)), these methods may be directly exploited to improve upon the performance of the joint estimation algorithm proposed in this disclosure.

In the computational aspect, carefully going through the derivations above shows that the overall complexity of the proposed joint geometric-radiometric estimation method is O(N²) for an image of dimensions N×N. This is due to the fact that the estimation procedure is essentially comprised of pointwise operations and summations applied to, at most, the image pixels themselves, and the solution of low-dimensional linear systems of equations; other computations (such as these employed to the sample distributions) are computationally negligible for large N.

3(b) Noiseless vs. Noisy—Radiometry Estimation Performance Comparison

We have proposed a method for estimating the unknown non-linear deformation Q in the presence of additive noise. On the other hand, we have also shown the robustness of the method derived for the noiseless case (i.e., the method presented that does not take the noise into account) in high SNR. It is then natural to compare the performance of both estimation schemes.

A simple comparison has been conducted in the estimation of the radiometric non-linearity Q. The template image, g, was taken to be a gray-scale version of the “flaming fractal” used in the previous example. Sequences of Monte Carlo experiments were performed at various SNRs (−10 dB÷80 dB). The observations, h, were generated by applying to g a non-linear mapping of the intensities Q(t)=t² followed by the addition of a zero mean white Gaussian noise with the appropriate variance.

As an example, FIG. 7 shows a sample template and observation at −3 dB. The radiometric deformation Q and its estimates are shown in FIG. 8.

The radiometry estimation errors (MSE) of both methods are shown in FIG. 9. Clearly, the method designed for the noisefree case outperforms at high and moderate SNRs. This is due to the numerical overhead of the estimator designed for the additive noise model and the errors it introduces; for example, calculating (2.56) in practice involves the inversion of sampled functions, which largely account for the degraded performance. The noisefree solution method, which is much less computationally demanding, may therefore be used even in the presence of noise. However, in cases of low SNR (less than 10 dB), taking the noise into account is crucial and the estimator designed under the noisy observation assumption outperforms.

3(c) Overall Performance—Noiseless Method

We examine the overall performance and robustness of the joint estimation method derived under the noiseless model assumption, in the case where in fact the observations are noisy, but the SNR is high. As mentioned earlier, this method is much less computationally demanding, and thus, may be more appealing to utilize.

The experiment was conducted in the same setting as the example above. Sequences of Monte Carlo experiments were performed at various SNRs (20 dB÷45 dB). The observations h were generated as above, and a zero mean white Gaussian noise with the appropriate variance was added.

Remark 1. On an image with 256 intensity levels (8-bit) per channel, the additive white Gaussian noise was chosen to have an STD equivalent to 0.5÷15 levels. In experiments, well-exposed images taken with several digital cameras (Nikon Coolpix 8700, Canon S2ISand Canon EOS-1dMkIII), have exhibited an inherent noise with STD equivalent to 0.5÷2 levels.

Remark 2. The SNR was calculated as the RMS SNR across the 3 color channels.

As before Q_(R), Q_(G) and Q_(B) were individually estimated by applying the proposed algorithm to each of the image channels. As expected, in lower SNR the mismatch of the algorithm derived for the noisefree case is no longer negligible, and a slight bias in the estimation of Q_(R), Q_(G) and Q_(B) is introduced. Nevertheless, estimation errors (MSE) were quite small (≈10⁻⁵ in the low SNR experiments). It should be noted that the estimation variance was practically zero, in correspondence with the strong consistency of the left hand composition estimation procedure, expressed in (2.50); hence, the MSE is mostly comprised of a deterministic bias, due to the mismatch between the no-noise assumption of the solution and the existence of noise in the actual observations.

Next, the estimation procedure for the affine deformation A outlined above was applied. The estimation errors (MSE, in the sense previously defined) are shown in FIG. 10. As before, the estimation variance was practically zero; hence, again the MSE is mostly comprised of a deterministic bias, due to the imperfect estimation of the radiometric deformation.

In order to illustrate the effects of quantization on the performance, we repeated the Monte Carlo experiment, this time in a more “realistic” setting. We requantized the observation image h, to 256 uniformly spaced levels. The estimation results of the affine deformation are also shown in FIG. 10. As expected, results are slightly worse than in the continuous (unquantized) case, yet not significantly.

Interestingly, somewhat better results are exhibited at SNR≈30 dB. This can be informally explained as follows: the entire estimation method strongly depend on the sample distribution. Obviously, this distribution is severely corrupted by quantization. In a sense, the perturbations introduced by the noise compensate for the loss of information caused by the requantization, leading to a better approximation of Th, the “empirical distribution” of h. As mentioned above, further discussion of the implications of requantization is out of scope of this disclosure.

3(d) Concluding Experiment

As a concluding experiment, we tested the applicability of the proposed method to real optical images. Images of a doll's head were acquired using a computer controlled digital camera (Canon S2IS). The acquired RGB images were of dimensions 2592×1944 at 8-bit per channel. A total of over 10,000 images were collected at different combinations of geometry (rotation angle) and radiometry (illumination power and shutter speed). Sample images are shown in FIG. 11 and FIG. 12.

A computer controlled motorized arm has rotated the doll's head in a plane approximately perpendicular to the camera's optical axis; rotation angles ranged from −150 to 150 degrees. The illumination power and camera's shutter speed were also controlled; illumination power ranged from 35% to 100%, and shutter speed ranged from 1/1600 to 0.6 seconds. Changing the illumination power and shutter speed globally affect intensity levels in the acquired image, where overall measurement non-linearities are inherently introduced by the camera. Moreover, changing the illumination power also alters the light's spectrum, possibly intensifying such global non-linear effects. In practice, we separately associated a non-linear mapping with each of the image color channels, that is, Q_(R), Q_(G) and Q_(B) as in the previous examples.

To validate our modeling assumption on the radiometric model, we examined pairs of images taken at the same rotation angle, but with different illumination power and shutter speed settings (see FIG. 12). We then plotted the joint histogram (comparagram) of each pair (for each color channel). The joint histogram of two images is defined as the matrix whose (n,m) entry is the number of pixels that simultaneously assume the value n in the first image and m in the second one. Recall that each pair of images is geometrically registered; hence, had there been a non-linear function Q radiometrically relating each pair, the joint histogram of the pair should exactly follow the graph of Q. Conversely, if the joint histogram of such pair seems to follow a graph (a curve) then there is (approximately) a global non-linear intensity mapping relating the pair. Finally, we applied the method derived above to estimate Q_(R), Q_(G) and Q_(B), and overlaid the estimates on the joint histograms of each channel (see FIG. 13). The results strongly suggested that the type of global radiometry considered in this paper is indeed a good approximation to the radiometric effects evident in the collected data.

Finally, we used the joint estimation method proposed in this paper to estimate the rotation angle. First, a template image has been selected (see the marked image in the center of FIG. 11). For each image in the dataset, the radiometric functions Q_(R), Q_(G) and Q_(B) were estimated, and a radiometric-registered image was calculated. Finally, the affine estimation procedure was applied, and the corresponding rotation angle calculated. The overall angular estimation had a bias of 3.2·10⁻⁶ degrees with a standard deviation of 1.41 degrees.

Furthermore, the overall estimation procedure has shown substantial stability to various real-life model mismatches evident in this experiment; among such are projective geometry (as we could not perfectly align the camera and the three dimensional object), non-uniform illumination (due to the light source), shadows, occlusions and other un-modeled perturbations (such as movements of the doll's hair). On the other hand, an attempt to directly use the geometry estimation procedure on the original data has failed.

3(e) Summary and Concluding Remarks

In the previous sections we have proposed a novel method for jointly estimating the geometric and radiometric deformations relating two observations on the same object. The case of a geometric affine transformation relating the images' coordinate systems, and a non-linear function relating the intensities of the two images has been modeled.

In section 1, we derived a solution for the noiseless case. We have shown that by using the sample distribution transformation T, the joint estimation problem may be decoupled into two simpler problems in the unknown radiometry and unknown geometry. More specifically, we have shown that the original high-dimensional non-linear non-convex search problem of simultaneously recovering the geometric and radiometric deformations can be represented by an equivalent sequence of two linear systems. A solution of this sequence yields an exact, explicit and efficient solution to the joint estimation problem, regardless of the deformations' magnitudes. In Section 2, we have further extended the solution to the case of noisy measurements. As in the noiseless case, we have derived a method for replacing the original problem with an almost surely equivalent sequence of two linear systems. Moreover, we have shown the robustness of the method derived for the noiseless case in the presence of noise for high SNR. Numerical performance analysis and synthetic examples as well as experiments with natural images have been presented in section 3.

The estimations described in this discussion may be implemented with any images, regardless of how they are captured. The estimations may be made using any computer hardware and software programs capable of implementing and performing the required calculations.

While certain representative embodiments and details have been shown for purposes of illustrating the invention, it will be apparent to those skilled in the art that various changes in the methods and apparatus disclosed herein may be made without departing from the scope of the invention which is defined in the appended claims. 

1. A method for estimating the joint radiometric and geometric deformations of an image of an object using a radiometry-first approach comprising: (a) analyzing a first observation h of an image, about a known m-dimensional signal g, of the form h(x)=Q(g(A(x)))+η(x)=Q∘g∘A+η, where Q is a non-linear mapping of the intensities (radiometry), A is an affine (geometric) transformation, and η is an additive noise; (b) estimating the joint radiometric and geometric deformations of the first observation without limitations on the magnitude of the geometric or radiometric deformations; and (c) analyzing the deformations of the first observation in comparison to a second observation of an object to determine whether the first observation and the second observation are of the same object; wherein (i) the sample distribution transformation T is invariant under right-hand affine compositions: T(g∘A)=Tg, and T(W∘g)=[Tg]∘W⁻¹ for any strictly increasing continuous function W:R→R such that W(0)=0; (ii) applying T to the (noiseless) relation h=Q∘g∘A, to obtain Th=[Tg]∘Q⁻¹; (iii) using the transformation T to convert a functional relation expressed by a left-hand composition (“radiometric deformation”) into a new functional relation expressed by a right-hand composition (“geometric deformation”) so equation H(t)=[G∘Q⁻¹](t)=G(Q⁻¹(t)) describes a new, one-dimensional, “time-domain” registration problem and any suitable (parametric or non-parametric) registration method may be used to recover (estimate) Q⁻¹; (iv) producing linear constraints on Q, enabling the explicit recovery of Q⁻¹ in terms of solving a linear system of equations, as follows: ${\int_{R}^{\;}{{wo}\overset{\sim}{H}}} = {\left\lbrack {\int_{R}^{\;}{{e_{1}\left( {{wo}\overset{\sim}{G}} \right)}L{\int_{R}^{\;}{e_{M}\left( {{wo}\overset{\sim}{G}} \right)}}}} \right\rbrack\ \begin{bmatrix} b_{1} \\ \vdots \\ b_{M} \end{bmatrix}}$ (v) once the radiometric deformation Q (or equivalently its inverse Q⁻¹) is recovered, rewriting h=Q∘g∘A, as ${h = {\underset{\underset{known}{︸}}{\left( {Q \circ g} \right)} \circ A}};$ and (vi) reducing the problem to a strictly geometric problem where A is recovered.
 2. A method for estimating the joint radiometric and geometric deformations of an image of an object using a geometry-first approach comprising: (a) analyzing a first observation h of an image, about a known m-dimensional signal g, of the form h(x)=Q(g(A(x)))+η(x)=Q∘g∘A+η, where Q is a non-linear mapping of the intensities (radiometry), A is an affine (geometric) transformation, and η is an additive noise; (b) estimating the joint radiometric and geometric deformations of the first observation without limitations on the magnitude of the geometric or radiometric deformations; and (c) analyzing the deformations of the first observation in comparison to a second observation of an object to determine whether the first observation and the second observation are of the same object; wherein (i) the sample distribution transformation T is invariant under right-hand affine compositions: T(g∘A)=Tg, and T(W∘g)=[Tg]∘W⁻¹ for any strictly increasing continuous function W:R→R such that W(0)=0; (ii) defining Rh=[Th]∘h−[Th](0), and letting H(x)=[Rh](x) and G(x)=[Rg](x), then, for all x∈R^(m) the following relation holds H(x)=G(A(x)), whereby R has converted the joint problem h=Q∘g∘A, in the unknowns Q and A, to a “new” problem in a single unknown, A, and by applying the operator R we have eliminated the left composition Q, representing the radiometric deformation, in the basic relation h=Q∘g∘A, whereby H(x)=[G∘A](x)=G(A(x)) describes a new, strictly geometric, affine registration problem, and a registration method may be used to recover (estimate) A, to produce a linear solution for A; and (iii) once the geometric transformation A is determined, h=Q∘g∘A, may be rewritten as ${h = {Q \circ \underset{\underset{known}{︸}}{\left( {g \circ A} \right)}}},$ whereby the problem reduces to a strictly radiometric problem, where Q is recovered.
 3. A method for estimating the joint radiometric and geometric deformations of an image of an object using a parallel registration approach comprising: (a) analyzing a first observation h of an image, about a known m-dimensional signal g, of the form h(x)=Q(g(A(x)))+η(x)=Q∘g∘A+η, where Q is a non-linear mapping of the intensities (radiometry), A is an affine (geometric) transformation, and η is an additive noise; (b) estimating the joint radiometric and geometric deformations of the first observation without limitations on the magnitude of the geometric or radiometric deformations; and (c) analyzing the deformations of the first observation in comparison to a second observation of an object to determine whether the first observation and the second observation are of the same object; wherein (i) the sample distribution transformation T is invariant under right-hand affine compositions: T(g∘A)=Tg and T(W∘g)=[Tg]∘W⁻¹ for any strictly increasing continuous function W:R→R such that W(0)=0; (ii) h=Q∘g∘A, is symmetrically decoupled into two strictly geometric problems in each of the unknowns [Th]=[Tg]∘Q⁻¹, [Rh]=[Rg]∘A, so a solution to the joint problem is obtained by separately solving the latter two problems for the unknowns Q⁻¹ and A, respectively; and (iii) explicit parametric solution methods are derived.
 4. The method of claim 1, wherein the estimation is in the presence of noise, further comprising: (a) letting H(t)=[T^({u) ^(i) ^(})h](t) and G(t)=[Tg](t); and (b) using ${H = {\left\lbrack {\left( {G \circ Q^{- 1}} \right)*f_{\eta}} \right\rbrack = {\sum\limits_{i}^{\;}\;{b_{i}\left( {\left\lbrack {e_{i} \circ G} \right\rbrack*f_{\eta}} \right)}}}},$ where f_(η) is the probability density of the noise, solving this linear system for the coefficients b_(i), to obtain $h = {{\underset{\underset{known}{︸}}{\left( {Q \circ g} \right)} \circ A} + {\eta\mspace{14mu}\ldots}}$
 5. The method of claim 2, wherein the estimation is in the presence of noise, further comprising: (a) letting H(t)=[T^({u) ^(i) ^(})h](t) and G(t)=[Tg](t); and (b) using ${H = {\left\lbrack {\left( {G \circ Q^{- 1}} \right)*f_{\eta}} \right\rbrack = {\sum\limits_{i}^{\;}\;{b_{i}\left( {\left\lbrack {e_{i} \circ G} \right\rbrack*f_{\eta}} \right)}}}},$ where f_(η) is the probability density of the noise, solving this linear system for the coefficients b_(i), to obtain $h = {{\underset{\underset{known}{︸}}{\left( {Q \circ g} \right)} \circ A} + {\eta\mspace{14mu}\ldots}}$
 6. The method of claim 3, wherein the estimation is in the presence of noise, further comprising: (a) letting H(t)=[T^({u) ^(i) ^(})h](t) and G(t)=[Tg](t); and (b) using ${H = {\left\lbrack {\left( {G \circ Q^{- 1}} \right)*f_{\eta}} \right\rbrack = {\sum\limits_{i}^{\;}\;{b_{i}\left( {\left\lbrack {e_{i} \circ G} \right\rbrack*f_{\eta}} \right)}}}},$ where f_(η) is the probability density of the noise, solving this linear system for the coefficients b_(i), to obtain $h = {{\underset{\underset{known}{︸}}{\left( {Q \circ g} \right)} \circ A} + {\eta.}}$ 